5.9. Three-Tier Server Scheme: Development, Staging, and Production
To facilitate transfer from the development
server to the production server, the code should be free of any
server-dependent variables. This will ensure that modules and scripts
can be moved from one directory on the development machine to another
directory (possibly in a different path) on the production machine
without problems.
If two simple rules are followed, server dependencies can
be safely isolated and, as far as the code goes, effectively ignored.
First, never use the server name (since development and production
machines have different names), and second, never use explicit base
directory names in the code. Of course, the code will often need to
refer to the server name and directory names, but we can centralize
them in server-wide configuration files (as we will see in a moment).
By trial and error, we have found that a three-tier (development,
staging, and production) scheme works best:
Development
The development tier
might include a single machine or several machines (for example, if
there are many developers and each one prefers to develop on his own
machine).
Staging
The staging
tier is generally a single machine that is basically identical to the
production machine and serves as a backup machine in case the
production machine needs an immediate replacement (for example,
during maintenance). This is the last station where the code is
staged before it is uploaded to the production machine.
The staging machine does not have to be anywhere near as powerful as
the production server if finances are stretched. The staging machine
is generally used only for staging; it does not require much
processor power or memory since only a few developers are likely to
use it simultaneously. Even if several developers are using it at the
same time, the load will be very low, unless of course benchmarks are
being run on it along with programs that create a load similar to
that on the production server (in which case the staging machine
should have hardware identical to that of the production machine).
Production
The production
tier might include a single machine or a huge cluster comprising many
machines.
You can also have the staging and production servers running on the
same machine. This is not ideal, especially if the production server
needs every megabyte of memory and every CPU cycle so that it can
cope with high request rates. But when a dedicated machine just for
staging purposes is prohibitively expensive, using the production
server for staging purposes is better than having no staging area at
all.
Another possibility is to have the staging environment on the
development machine.
So how does this three-tier scheme work?
Developers write the code on their machines (development tier) and
test that it works as expected. These machines should be set up with
an environment as similar to the production server as possible. A
manageable and simple approach is to have each developer running his
own local Apache server on his own machine. If the code relies on a
database, the ideal scenario is for each developer to have access to
a development database account and server, possibly even on their own
machines.
The pre-release manager installs the code on the staging tier machine
and stages it. Whereas developers can change their own
httpd.conf files on their own machines, the
pre-release manager will make the necessary changes on the staging
machine in accordance with the instructions provided by the
developers.
The release manager installs the code on the production tier
machine(s), tests it, and monitors for a while to ensure that things
work as expected.
Of course, on some projects, the developers, the pre-release
managers, and the release managers can actually be the same person.
On larger projects, where different people perform these roles and
many machines are involved, preparing upgrade packages with a
packaging tool such as RPM becomes even more important, since it
makes it far easier to keep every machine's
configuration and software in sync.
Now that we have described the theory behind the three-tier approach,
let us see how to have all the code independent of the machine and
base directory names.
Although the example shown below is simple, the real configuration
may be far more complex; however, the principles apply regardless of
complexity, and it is straightforward to extend a simple initial
configuration into one that is sufficient for more
complex environments.
Basically, what we need is the name of the machine, the port on which
the server is running (assuming that the port number is not hidden
with the help of a proxy server), the root directory of the web
server-specific files, the base directories of static objects and
Perl scripts, the appropriate relative and full URIs for these base
directories, and a support email address. This amounts to 10
variables.
We prepare a minimum of three Local::Config
packages, one per tier, each suited to a particular
tier's environment. As mentioned earlier, there can
be more than one machine per tier and even more than one web server
running on the same machine. In those cases, each web server will
have its own Local::Config package. The total
number of Local::Config packages will be equal to
the number of web servers.
For example, for the
development tier, the configuration
package might look like Example 5-3.
Example 5-3. Local/Config.pm
package Local::Config;
use strict;
use constant SERVER_NAME => 'dev.example.com';
use constant SERVER_PORT => 8000;
use constant ROOT_DIR => '/home/userfoo/www';
use constant CGI_BASE_DIR => '/home/userfoo/www/perl';
use constant DOC_BASE_DIR => '/home/userfoo/www/docs';
use constant CGI_BASE_URI => 'http://dev.example.com:8000/perl';
use constant DOC_BASE_URI => 'http://dev.example.com:8000';
use constant CGI_RELATIVE_URI => '/perl';
use constant DOC_RELATIVE_URI => '';
use constant SUPPORT_EMAIL => 'stas@example.com';
1;
The constants have uppercase names, in accordance with Perl
convention.
The configuration shows that the development server is named
dev.example.com and listens on port 8000. Web
server-specific files reside under the
/home/userfoo/www directory. Think of this as a
directory www that resides under user
userfoo's home directory,
/home/userfoo. A developer whose username is
userbar might use
/home/userbar/www as the development root
directory.
If there is another web server running on the same machine, create
another Local::Config with a different port number
and probably a different root directory.
To avoid duplication of identical parts of the configuration, the
package can be rewritten as shown in Example 5-4.
Example 5-4. Local/Config.pm
package Local::Config;
use strict;
use constant DOMAIN_NAME => 'example.com';
use constant SERVER_NAME => 'dev.' . DOMAIN_NAME;
use constant SERVER_PORT => 8000;
use constant ROOT_DIR => '/home/userfoo/www';
use constant CGI_BASE_DIR => ROOT_DIR . '/perl';
use constant DOC_BASE_DIR => ROOT_DIR . '/docs';
use constant CGI_BASE_URI => 'http://' . SERVER_NAME . ':' . SERVER_PORT
. '/perl';
use constant DOC_BASE_URI => 'http://' . SERVER_NAME . ':' . SERVER_PORT;
use constant CGI_RELATIVE_URI => '/perl';
use constant DOC_RELATIVE_URI => '';
use constant SUPPORT_EMAIL => 'stas@' . DOMAIN_NAME;
1;
Reusing constants that were previously defined reduces the risk of
making a mistake. In the original file, several lines need to be
edited if the server name is changed, but in this new version only
one line requires editing, eliminating the risk of your forgetting to
change a line further down the file. All the use constant
statements are executed at compile time, in the
order in which they are specified. The constant
pragma ensures that any attempt to change these values in the code
leads to an error, so they can be relied on to be correct. (Note that
in certain contexts, e.g., when they're used as
hash keys, Perl treats the constant's bare name as a plain string. The
solution is to either prepend & or append
( ), so ROOT_DIR would become
either &ROOT_DIR or ROOT_DIR( ).)
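The hash-key pitfall is easy to reproduce. The following minimal sketch shows that a bare constant name inside the braces is taken as a literal string, while the & and ( ) forms call the constant. The %size hash and its value are made up purely for illustration.

```perl
use strict;
use warnings;
use constant ROOT_DIR => '/home/userfoo/www';

# hypothetical data, for illustration only
my %size = ('/home/userfoo/www' => 42);

# Inside { }, a bare identifier is quoted automatically, so this
# tests for the literal key 'ROOT_DIR', which does not exist.
my $wrong = exists $size{ROOT_DIR} ? 1 : 0;

# Both of these call the constant and look up '/home/userfoo/www'.
my $with_parens    = $size{ ROOT_DIR() };
my $with_ampersand = $size{ &ROOT_DIR };

print "bare: $wrong, parens: $with_parens, ampersand: $with_ampersand\n";
```

The same automatic quoting applies to a bare name on the left side of the => operator, so the ( ) form is the safer habit there as well.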
Now, when the code needs to access the server's
global configuration, it needs to refer only to the variables in this
module. For example, in an application's
configuration file, you can create a dynamically generated
configuration, which will change from machine to machine without your
needing to touch any code (see Example 5-5).
Example 5-5. App/Foo/Config.pm
package App::Foo::Config;
use Local::Config ( );
use strict;
use vars qw($CGI_URI $CGI_DIR);
# directories and URIs of the App::Foo CGI project
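# (the values are constants, i.e., subroutines, so they are called
# rather than read as package scalars)
$CGI_URI = Local::Config::CGI_BASE_URI( ) . '/App/Foo';
$CGI_DIR = Local::Config::CGI_BASE_DIR( ) . '/App/Foo';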
1;
Notice that we used fully qualified names instead of
importing these global configuration constants into the
caller's namespace. This saves a few bytes of
memory, and since Local::Config will be loaded by
many modules, these savings will quickly add up. Programmers used to
programming Perl outside the mod_perl environment might be tempted to
add Perl's exporting mechanism to
Local::Config and thereby save themselves some
typing. We prefer not to use Exporter.pm under
mod_perl, because we want to save as much memory as possible. (Even
though the amount of memory overhead for using an exported name is
small, this must be multiplied by the number of concurrent users of
the code, which could be hundreds or even thousands on a busy site
and could turn a small memory overhead into a large one.)
For the staging tier, a similar
Local::Config module with just a few changes (as
shown in Example 5-6) is necessary.
Example 5-6. Local/Config.pm
package Local::Config;
use strict;
use constant DOMAIN_NAME => 'example.com';
use constant SERVER_NAME => 'stage.' . DOMAIN_NAME;
use constant SERVER_PORT => 8000;
use constant ROOT_DIR => '/home';
use constant CGI_BASE_DIR => ROOT_DIR . '/perl';
use constant DOC_BASE_DIR => ROOT_DIR . '/docs';
use constant CGI_BASE_URI => 'http://' . SERVER_NAME . ':' . SERVER_PORT
. '/perl';
use constant DOC_BASE_URI => 'http://' . SERVER_NAME . ':' . SERVER_PORT;
use constant CGI_RELATIVE_URI => '/perl';
use constant DOC_RELATIVE_URI => '';
use constant SUPPORT_EMAIL => 'stage@' . DOMAIN_NAME;
1;
We have named our staging tier machine
stage.example.com. Its root directory is
/home.
The production tier version of
Local/Config.pm is shown in Example 5-7.
Example 5-7. Local/Config.pm
package Local::Config;
use strict;
use constant DOMAIN_NAME => 'example.com';
use constant SERVER_NAME => 'www.' . DOMAIN_NAME;
use constant SERVER_PORT => 8000;
use constant ROOT_DIR => '/home';
use constant CGI_BASE_DIR => ROOT_DIR . '/perl';
use constant DOC_BASE_DIR => ROOT_DIR . '/docs';
use constant CGI_BASE_URI => 'http://' . SERVER_NAME . ':' . SERVER_PORT
. '/perl';
use constant DOC_BASE_URI => 'http://' . SERVER_NAME . ':' . SERVER_PORT;
use constant CGI_RELATIVE_URI => '/perl';
use constant DOC_RELATIVE_URI => '';
use constant SUPPORT_EMAIL => 'support@' . DOMAIN_NAME;
1;
You can see that the setups of the staging and production machines
are almost identical. This is only in our example; in reality, they
can be very different.
The most important point is that the Local::Config
module from a machine on one tier must never be
moved to a machine on another tier, since it will break the code. If
locally built packages are used, the Local::Config
file can simply be excluded—this will help to reduce the risk
of inadvertently copying it.
From now on, when modules and scripts are moved between machines, you
shouldn't need to worry about having to change
variables to accommodate the different machines'
server names and directory layouts. All this is accounted for by the
Local::Config files.
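As a rough illustration of what this buys us, the sketch below builds a filesystem path and a link from the configuration constants alone, so the same application code would run unchanged on any tier. The Local::Config package is inlined here only to keep the sketch self-contained (normally it lives in its own file), and the helpers script_path and script_link, as well as the script name, are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the tier's Local/Config.pm, inlined so that the
# sketch runs on its own.
package Local::Config;
use constant ROOT_DIR         => '/home/userfoo/www';
use constant CGI_BASE_DIR     => ROOT_DIR . '/perl';
use constant CGI_RELATIVE_URI => '/perl';

package main;

# hypothetical helper: where a script lives on disk on this machine
sub script_path {
    my ($script) = @_;
    return Local::Config::CGI_BASE_DIR() . '/' . $script;
}

# hypothetical helper: the relative link that reaches the script
sub script_link {
    my ($script) = @_;
    return Local::Config::CGI_RELATIVE_URI() . '/' . $script;
}

print script_path('App/Foo/report.pl'), "\n";
print script_link('App/Foo/report.pl'), "\n";
```

Neither helper mentions a server name or a base directory, so moving this code to the staging or production machine requires no edits; only that machine's Local::Config differs.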
Some developers prefer to run conversion scripts on the moved code
that adjust all variables to the local machine. This approach is
error-prone, since variables can be written in different ways, and it
may result in incomplete adjustment and broken code. Therefore, the
conversion approach is not recommended.
5.9.1. Starting a Personal Server for Each Developer
When just one developer is
working on a specific server, there are fewer problems, because she
can have complete control over the server. However, often a group of
developers need to develop mod_perl scripts and modules concurrently
on the same machine. Each developer wants to have control over the
server: to be able to stop it, run it in single-server mode, restart
it, etc. They also want control over the location of log files,
configuration settings such as MaxClients, and so
on.
Each developer might have her own desktop machine, but all
development and staging might be done on a single central development
machine (e.g., if the developers' personal desktop
machines run a different operating system from the one running on the
development and production machines).
One workaround for this problem involves having a few versions of the
httpd.conf file (each having different
Port, ErrorLog, etc.
directives) and forcing each developer's server to
be started with:
panic% httpd_perl -f /path/to/httpd.conf
However, this means that these files must be kept synchronized when
there are global changes affecting all developers. This can be quite
difficult to manage. The solution we use is to have a single
httpd.conf file and use the
-Dparameter server startup option to enable a
specific section of httpd.conf for each
developer. Each developer starts the server with his or her username
as an argument. As a result, a server uses both the global settings
and the developer's private settings.
For example, user stas would start his server
with:
panic% httpd_perl -Dstas
In httpd.conf, we write:
# Personal development server for stas
# stas uses the server running on port 8000
<IfDefine stas>
Port 8000
PidFile /home/httpd/httpd_perl/logs/httpd.pid.stas
ErrorLog /home/httpd/httpd_perl/logs/error_log.stas
Timeout 300
KeepAlive On
MinSpareServers 2
MaxSpareServers 2
StartServers 1
MaxClients 3
MaxRequestsPerChild 15
# let developers add their own configuration
# so they can override the defaults
Include /home/httpd/httpd_perl/conf/stas.conf
</IfDefine>
# Personal development server for eric
# eric uses the server running on port 8001
<IfDefine eric>
Port 8001
PidFile /home/httpd/httpd_perl/logs/httpd.pid.eric
ErrorLog /home/httpd/httpd_perl/logs/error_log.eric
Timeout 300
KeepAlive Off
MinSpareServers 1
MaxSpareServers 2
StartServers 1
MaxClients 5
MaxRequestsPerChild 0
Include /home/httpd/httpd_perl/conf/eric.conf
</IfDefine>
With this technique, we have separate
error_log files and full control over server
starting and stopping, the number of child processes, and port
selection for each server. This saves Eric from having to call Stas
several times a day just to warn, "Stas,
I'm restarting the server" (a
ritual in development shops where all developers are using the same
mod_perl server).
With this technique, developers will need to learn the
PIDs of their parent
httpd_perl processes. For user
stas, this can be found in
/home/httpd/httpd_perl/logs/httpd.pid.stas. To
make things even easier, we change the apachectl
script to do the work for us. We make a copy for each developer,
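called apachectl.username, and change two lines
in each script: the PIDFILE variable, which must point to that
developer's PID file, and the HTTPD variable, which must start the
server with that developer's -D option.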
Now when user stas wants to stop the server, he
executes:
panic% apachectl.stas stop
And to start the server, he executes:
panic% apachectl.stas start
And so on, for all other apachectl arguments.
It might seem that we could have used just one
apachectl and have it determine for itself who
executed it by checking the UID. But the setuid bit must be enabled
on this script, because starting the server requires
root privileges. With the setuid bit set, a
single apachectl script can be used for all
developers, but it will have to be modified to include code to read
the UID of the user executing it and to use this value when setting
developer-specific paths and variables.
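The UID lookup described above might be sketched as follows. This is only an illustration of deriving per-developer settings from the invoking user, not a replacement for apachectl: the helper developer_settings is hypothetical, and the location of the httpd_perl binary under bin/ is an assumption (the log directory is the one used in this section).

```perl
use strict;
use warnings;

# The logs/ layout follows this section; the bin/ location of the
# httpd_perl binary is an assumption.
my $SERVER_ROOT = '/home/httpd/httpd_perl';

# hypothetical helper: per-developer settings derived from a username
sub developer_settings {
    my ($user) = @_;
    return (
        pidfile => "$SERVER_ROOT/logs/httpd.pid.$user",
        httpd   => "$SERVER_ROOT/bin/httpd_perl -D$user",
    );
}

# $< holds the real UID of the user who invoked the script;
# getpwuid in scalar context returns the matching username.
my $user = getpwuid($<);
if (defined $user) {
    my %conf = developer_settings($user);
    print "PIDFILE=$conf{pidfile}\n";
    print "HTTPD='$conf{httpd}'\n";
}
else {
    warn "cannot map UID $< to a username\n";
}
```

The real UID, rather than the effective one, is what identifies the developer, since the effective UID is the one that changes when a program runs with the setuid bit set.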
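The last thing you need to do is to provide developers with an option
to run in single-process mode. For example, user stas can add the -X
option when starting his server by hand (httpd_perl -Dstas -X), which
keeps Apache running as a single process.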
To make the development process even easier, we decided to
use relative links in all static documents, including calls to
dynamically generated documents. Since each
developer's server is running on a different port,
we have to make it possible for these relative links to reach the
correct port number.
When typing the URI by hand, it's easy. For example,
when user stas, whose server is running on port
8000, wants to access the relative URI
/test/example, he types
http://www.example.com:8000/test/example to get
the generated HTML page. Now if this document includes a link to the
relative URI /test/example2 and
stas clicks on it, the browser will
automatically generate a full request
(http://www.example.com:8000/test/example2) by
reusing the server:port combination from the
previous request.
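What the browser does here can be imitated in a few lines. The sketch below is a deliberately simplified resolver that handles only links beginning with a slash, which is the case discussed above; the function name resolve_link is made up, and real browsers implement the complete URI resolution rules.

```perl
use strict;
use warnings;

# Simplified resolver for links that start with '/': keep the
# scheme, host, and port of the current page and replace the path.
sub resolve_link {
    my ($base, $link) = @_;
    my ($origin) = $base =~ m{^(\w+://[^/]+)}
        or die "not an absolute URI: $base\n";
    return $origin . $link;
}

# stas is viewing a page served from port 8000 and follows a link
print resolve_link('http://www.example.com:8000/test/example',
                   '/test/example2'), "\n";
```

Because the port comes from the page being viewed, the same HTML sends stas to port 8000 and eric to port 8001 without any change to the documents.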
Note that all static objects will be served from the same server as
well. This may be an acceptable situation for the development
environment, but if it is not, a slightly more complicated solution
involving the mod_rewrite Apache module will have to be devised.
To use mod_rewrite, we have to configure our
httpd_docs (light) server with
--enable-module=rewrite and recompile, or
use DSOs and load and enable the module in
httpd.conf. In the
httpd.conf file of our
httpd_docs server, we have the following code:
RewriteEngine on
# stas's server
# port = 8000
RewriteCond %{REQUEST_URI} ^/perl
RewriteCond %{REMOTE_ADDR} ^123\.34\.45\.56$
RewriteRule ^/(.*) http://www.example.com:8000/$1 [P,L]
# eric's server
# port = 8001
RewriteCond %{REQUEST_URI} ^/perl
RewriteCond %{REMOTE_ADDR} ^123\.34\.45\.57$
RewriteRule ^/(.*) http://www.example.com:8001/$1 [P,L]
# all the rest
RewriteCond %{REQUEST_URI} ^/perl
RewriteRule ^/(.*) http://www.example.com:81/$1 [P]
The IP addresses are those of the developer desktop machines (i.e.,
where they run their web browsers). If an HTML file includes a
relative URI such as /perl/test.pl or even
http://www.example.com/perl/test.pl, requests
for those URIs from user stas's
machine will be internally proxied to
http://www.example.com:8000/perl/test.pl, and
requests generated from user
eric's machine will be proxied
to http://www.example.com:8001/perl/test.pl.
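Another possibility is to use the REMOTE_USER
variable. This requires that all developers be authenticated when
they access the server. To do so, change the
RewriteCond directives that match REMOTE_ADDR in the above example
so that they match REMOTE_USER instead. Because these rules are in
the server context, where authentication has not yet run, mod_rewrite
needs the look-ahead form %{LA-U:REMOTE_USER}.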
Remember, the above setup will work only with relative URIs in the
HTML code. If the HTML output by the code uses full URIs including a
port other than 80, the requests originating from this HTML code will
bypass the light server listening on the default port 80 and go
directly to the server and port of the full URI.