To be able to improve the performance of your system you need a prior
understanding of what can be improved, how it can be improved, how
much it can be improved, and, most importantly, what impact the
improvement will have on the overall performance of your system. You
need to be able to identify those things that, after you have done
your best to improve them, will yield substantial benefits for the
overall system performance. Concentrate your efforts on them, and
avoid wasting time on improvements that give little overall gain.
If you have a small application it may be possible to detect places
that could be improved simply by inspecting the code. On the other
hand, if you have a large application, or many applications,
it's usually impossible to do the detective work
with the naked eye. You need observation instruments and measurement
tools. These belong to the benchmarking and code-profiling
categories.
It's important to understand that in most of
the benchmarking tests
that we will execute, we will not be looking at absolute results. Few
machines will have exactly the same hardware and software setup, so
comparing absolute numbers would usually be misleading. In most
cases we will be trying to show which coding approach is preferable,
so the hardware is almost irrelevant.
Rather than looking at absolute results, we will be looking at the
differences between two or more result sets run on the same machine.
This is what you should do; you shouldn't try to
compare the absolute results collected here with the results of those
same benchmarks on your own machines.
In this chapter we will present a few existing tools that are widely
used; we will apply them to example code snippets to show you how
performance can be measured, monitored, and improved; and we will
give you an idea of how you can develop your own tools.
9.1. Server Benchmarking
As web service developers, the most important thing we should strive
for is to offer the user a fast, trouble-free browsing experience.
Measuring the response rates of our servers under a variety of load
conditions and benchmark programs helps us to do this.
A benchmark program may consume significant
resources, so you cannot find the real times that a typical user will
wait for a response from your service by running the benchmark on the
server itself. Ideally you should run it from a different machine. A
benchmark program is unlike a typical user in the way it generates
requests. It should be able to emulate multiple concurrent users
connecting to the server by generating many concurrent requests. We
want to be able to tell the benchmark program what load we want to
emulate—for example, by specifying the number or rate of
requests to be made, the number of concurrent users to emulate, lists
of URLs to request, and other relevant arguments.
9.1.1. ApacheBench
ApacheBench
(ab) is a tool for benchmarking your Apache HTTP
server. It is designed to give you an idea of the performance that
your current Apache installation can give. In particular, it shows
you how many requests per second your Apache server is capable of
serving. The ab tool comes bundled with the
Apache source distribution, and like the Apache web server itself,
it's free.
Let's try it. First we create a test script, as
shown in Example 9-1.
Example 9-1. simple_test.pl
my $r = shift;                        # the Apache request object
$r->send_http_header('text/plain');   # send the response headers
print "Hello\n";                      # the 6-byte response body
We will simulate 10 users concurrently requesting the file
simple_test.pl through
http://localhost/perl/simple_test.pl. Each
simulated user makes 500 requests. We generate 5,000 requests in
total:
panic% ./ab -n 5000 -c 10 http://localhost/perl/simple_test.pl
Server Software: Apache/1.3.25-dev
Server Hostname: localhost
Server Port: 8000
Document Path: /perl/simple_test.pl
Document Length: 6 bytes
Concurrency Level: 10
Time taken for tests: 5.843 seconds
Complete requests: 5000
Failed requests: 0
Broken pipe errors: 0
Total transferred: 810162 bytes
HTML transferred: 30006 bytes
Requests per second: 855.72 [#/sec] (mean)
Time per request: 11.69 [ms] (mean)
Time per request: 1.17 [ms] (mean, across all concurrent requests)
Transfer rate: 138.66 [Kbytes/sec] received
Connnection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     1    1.4      0    17
Processing:     1    10   12.9      7   208
Waiting:        0     9   13.0      7   208
Total:          1    11   13.1      8   208
Most of the report is not very interesting to us. What we really care
about are the Requests per second and
Connection Times results:
Requests per second
The number of requests (to our test script) the server was able to
serve in one second
Connect and Waiting times
The amount of time it took to establish the connection and get the
first bits of a response
Processing time
The server response time—i.e., the time it took for the server
to process the request and send a reply
Total time
The sum of the Connect and Processing times
As you can see, the server was able to respond on average to 856
requests per second. On average, it took almost no time to establish a
connection to the server (both the client and the server are running
on the same machine) and 10 milliseconds to process each request. As
the code becomes more complicated you will see that the processing
time grows while the connection time remains constant. The latter
isn't influenced by the code complexity, so when you
are working on your code performance, you care only about the
processing time. When you are benchmarking the overall service, you
are interested in both.
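The summary lines of the report can be cross-checked against each other. The following is a minimal sketch, assuming the usual relationships between the counters (requests divided by elapsed time, and elapsed time scaled by the concurrency level); the helper name ab_summary is ours and is not part of ab. The input figures are copied from the report above.

```perl
use strict;
use warnings;

# Raw counters copied from the ab report shown earlier.
my %report = (
    requests    => 5000,     # Complete requests
    concurrency => 10,       # Concurrency Level
    seconds     => 5.843,    # Time taken for tests
);

# Derive the three summary figures from the raw counters.
# (ab_summary is a hypothetical helper written for this illustration.)
sub ab_summary {
    my %r = @_;
    my $rps = $r{requests} / $r{seconds};
    # Mean time one simulated user waits for one request, in ms.
    my $per_request = 1000 * $r{concurrency} * $r{seconds} / $r{requests};
    # Mean time between completed requests over all users, in ms.
    my $per_request_all = 1000 * $r{seconds} / $r{requests};
    return ($rps, $per_request, $per_request_all);
}

my ($rps, $tpr, $tpr_all) = ab_summary(%report);
printf "Requests per second: %.2f\n",    $rps;
printf "Time per request:    %.2f ms (mean)\n", $tpr;
printf "Time per request:    %.2f ms (mean, across all concurrent requests)\n",
    $tpr_all;
```

The three values agree with the 855.72, 11.69, and 1.17 figures printed by ab, which is a quick way to convince yourself how the numbers relate before comparing two runs.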
Just for fun, let's benchmark a similar script,
shown in Example 9-2, under mod_cgi.
Requests per second: 156.40 [#/sec] (mean)
Time per request: 63.94 [ms] (mean)
Now, when essentially the same script is executed under mod_cgi
instead of mod_perl, the server handles only 156 requests per second,
not 856.
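Since we care about relative rather than absolute results, it is convenient to express the difference between two runs as a ratio. The following small sketch does that for the two rates reported above; the function name speedup is a hypothetical helper, not something provided by ab.

```perl
use strict;
use warnings;

# Return how many times larger the first rate is than the second.
# (speedup is a hypothetical helper used only for this illustration.)
sub speedup {
    my ($rate_a, $rate_b) = @_;
    die "rates must be positive\n" unless $rate_a > 0 && $rate_b > 0;
    return $rate_a / $rate_b;
}

my $mod_perl_rps = 855.72;    # Requests per second under mod_perl
my $mod_cgi_rps  = 156.40;    # Requests per second under mod_cgi

printf "mod_perl served %.1f times as many requests per second as mod_cgi\n",
    speedup($mod_perl_rps, $mod_cgi_rps);
```

The ratio is only meaningful because both runs were made on the same machine with the same ab arguments.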
ApacheBench can use KeepAlive connections, issue
GET (default) and POST
requests, use Basic Authentication, and send cookies
and custom HTTP headers. The version of
ApacheBench released with Apache version 1.3.20
adds SSL support, generates gnuplot and CSV
output for postprocessing, and reports median and standard deviation
values.
HTTPD::Bench::ApacheBench, available
from CPAN, provides a Perl interface
for ab.
9.1.2. httperf
httperf
is another tool for measuring web server performance. Its input and
reports are different from the ones we saw while using
ApacheBench. This tool's
manpage includes an in-depth explanation of all the options it
accepts and the results it generates. Here we will concentrate on the
input and on the part of the output that is most interesting to us.
With httperf you cannot specify the concurrency
level; instead, you have to specify the connection opening rate
(--rate) and the number of calls
(--num-call) to perform on each opened
connection. To compare the results with those we received from
ApacheBench, we will use a connection rate
slightly higher than the number of requests per second
reported by ApacheBench. That number was 856, so
we will try a rate of 860 (--rate 860) with
just one request per connection (--num-call
1). As in the previous test, we are going to make 5,000
requests (--num-conn 5000). We have set a
timeout of 60 seconds (--timeout 60) and allowed httperf to use
as many ports as it needs (--hog).
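Before running the test it helps to work out what these settings imply. The sketch below assembles the command line from the options just described and estimates the total number of requests and roughly the shortest time the run can take, assuming connections are opened at a steady rate. The helper httperf_plan and its hash keys are hypothetical names chosen for this illustration; only the httperf flags themselves come from the text.

```perl
use strict;
use warnings;

# Build an httperf command line and estimate what the run will do.
# (httperf_plan and its option keys are hypothetical names.)
sub httperf_plan {
    my %opt = @_;
    my @cmd = (
        'httperf',
        '--server',   $opt{server},
        '--port',     $opt{port},
        '--uri',      $opt{uri},
        '--hog',
        '--rate',     $opt{rate},
        '--num-conn', $opt{num_conn},
        '--num-call', $opt{num_call},
        '--timeout',  $opt{timeout},
    );
    return {
        command        => join(' ', @cmd),
        # Each connection performs num_call requests.
        total_requests => $opt{num_conn} * $opt{num_call},
        # Opening num_conn connections at a fixed rate takes about
        # this many seconds, so the run cannot be much shorter.
        min_seconds    => $opt{num_conn} / $opt{rate},
    };
}

my $plan = httperf_plan(
    server   => 'localhost',
    port     => 80,
    uri      => '/perl/simple_test.pl',
    rate     => 860,
    num_conn => 5000,
    num_call => 1,
    timeout  => 60,
);

print "$plan->{command}\n";
printf "%d requests, about %.2f seconds at best\n",
    $plan->{total_requests}, $plan->{min_seconds};
```

With a rate of 860 and 5,000 connections the estimate is a little under six seconds, so a noticeably longer test duration would suggest that the server could not keep up with the requested rate.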
So let's execute the benchmark and analyze the
results:
panic% httperf --server localhost --port 80 --uri /perl/simple_test.pl \
--hog --rate 860 --num-conn 5000 --num-call 1 --timeout 60
Maximum connect burst length: 11
Total: connections 5000 requests 5000 replies 5000 test-duration 5.854 s
Connection rate: 854.1 conn/s (1.2 ms/conn, <=50 concurrent connections)
Connection time [ms]: min 0.8 avg 23.5 max 226.9 median 20.5 stddev 13.7
Connection time [ms]: connect 4.0
Connection length [replies/conn]: 1.000
Request rate: 854.1 req/s (1.2 ms/req)
Request size [B]: 79.0
Reply rate [replies/s]: min 855.6 avg 855.6 max 855.6 stddev 0.0 (1 samples)
Reply time [ms]: response 19.5 transfer 0.0
Reply size [B]: header 184.0 content 6.0 footer 2.0 (total 192.0)
Reply status: 1xx=0 2xx=5000 3xx=0 4xx=0 5xx=0
CPU time [s]: user 0.33 system 1.53 (user 5.6% system 26.1% total 31.8%)
Net I/O: 224.4 KB/s (1.8*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
As before, we are mostly interested in the average Reply
rate: 855, almost exactly the same result reported by
ab in the previous section. Notice that when we
tried --rate 900 for this particular setup,
the reported request rate went down drastically, since the
server's performance gets worse when there are more
requests than it can handle.
9.1.3. http_load
http_load is
yet another utility that does web server load testing. It can
simulate a 33.6 Kbps modem connection
(-throttle) and allows you to provide a file
with a list of URLs that will be fetched randomly. You can specify
how many parallel connections to run (-parallel
N) and the number of requests to generate per second
(-rate N). Finally, you can tell the utility
when to stop by specifying either the test time length
(-seconds N) or the total number of fetches
(-fetches N).
Again, we will try to verify the results reported by
ab (claiming that the script under test can
handle about 855 requests per second on our machine). Therefore we
run http_load with a rate of 860 requests per
second, for 5 seconds in total. We invoke it on the file
urls, containing a single URL:
http://localhost/perl/simple_test.pl
Here is the generated output:
panic% http_load -rate 860 -seconds 5 urls
4278 fetches, 325 max parallel, 25668 bytes, in 5.00351 seconds
6 mean bytes/connection
855 fetches/sec, 5130 bytes/sec
msecs/connect: 20.0881 mean, 3006.54 max, 0.099 min
msecs/first-response: 51.3568 mean, 342.488 max, 1.423 min
HTTP response codes:
code 200 -- 4278
This application also reports almost exactly the same response-rate
capability: 855 requests per second. Of course, you may think that
it's because we have specified a rate close to this
number. But no, if we try the same test with a higher rate:
panic% http_load -rate 870 -seconds 5 urls
4045 fetches, 254 max parallel, 24270 bytes, in 5.00735 seconds
6 mean bytes/connection
807.813 fetches/sec, 4846.88 bytes/sec
msecs/connect: 78.4026 mean, 3005.08 max, 0.102 min
we can see that the performance goes down—it reports a response
rate of only 808 requests per second.
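When comparing several runs it is handy to recompute the fetch rate directly from the first line of each report. The following is a minimal sketch that parses that summary line, assuming it always has the form shown in the two reports above; the function name fetch_rate is a hypothetical helper, not part of http_load.

```perl
use strict;
use warnings;

# Extract the fetch count and duration from the first line of an
# http_load report and return fetches per second (undef if the line
# does not look like a summary line).
# (fetch_rate is a hypothetical helper written for this illustration.)
sub fetch_rate {
    my ($line) = @_;
    my ($fetches, $seconds) =
        $line =~ /^(\d+) fetches, .* in ([\d.]+) seconds/
        or return;
    return $fetches / $seconds;
}

# Summary lines copied from the two runs above.
my %runs = (
    860 => '4278 fetches, 325 max parallel, 25668 bytes, in 5.00351 seconds',
    870 => '4045 fetches, 254 max parallel, 24270 bytes, in 5.00735 seconds',
);

for my $rate (sort { $a <=> $b } keys %runs) {
    printf "-rate %d: %.1f fetches/sec\n", $rate, fetch_rate($runs{$rate});
}
```

The recomputed values match the 855 and 807.813 fetches per second that http_load printed, which confirms that the drop comes from fewer completed fetches in the same five seconds.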
The nice thing about this utility is that you can list a few URLs to
test. The URLs that get fetched are chosen randomly from the
specified file.
Note that when you provide a file with a list of URLs, you must make
sure that you don't have empty lines in it. If you
do, the utility will fail and complain:
./http_load: unknown protocol -
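One way to avoid this failure is to filter the URL list before handing it to http_load. The sketch below drops empty and whitespace-only lines from a list held in a string; the function name clean_url_list and the sample URLs are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Return the non-empty lines of a URL list, with surrounding
# whitespace trimmed.
# (clean_url_list is a hypothetical helper written for this illustration.)
sub clean_url_list {
    my ($text) = @_;
    # Keep only lines that contain at least one non-space character.
    my @urls = grep { /\S/ } split /\r?\n/, $text;
    s/^\s+|\s+$//g for @urls;
    return @urls;
}

# A sample list with the kind of blank lines that upset http_load.
# The second URL is a made-up example.
my $raw = <<'END';
http://localhost/perl/simple_test.pl

http://localhost/perl/another_test.pl

END

print "$_\n" for clean_url_list($raw);
```

In practice you would read the text from your urls file, write the cleaned lines to a new file, and pass that file to http_load.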
9.1.4. Other Web Server Benchmark Utilities
The following are also interesting benchmarking applications
implemented in Perl:
HTTP::WebTest
The HTTP::WebTest
module (available from CPAN) runs tests on remote URLs or local web
files containing Perl, JSP, HTML, JavaScript, etc. and generates a
detailed test report.
HTTP::Monkeywrench
HTTP::Monkeywrench is
a test-harness application to test the integrity of a
user's path through a web site.
Apache::Recorder and HTTP::RecordedSession
Apache::Recorder
(available from CPAN) is a mod_perl handler that records an HTTP
session and stores it on the web server's
filesystem.
HTTP::RecordedSession
reads the recorded session from the filesystem and formats it for
playback using HTTP::WebTest or
HTTP::Monkeywrench. This is useful when writing
acceptance and regression tests.
Many other benchmark utilities are available both for free and for
money. If you find that none of these suits your needs,
it's quite easy to roll your own utility. The
easiest way to do this is to write a Perl script that uses the
LWP::Parallel::UserAgent and
Time::HiRes modules. The former module allows you
to open many parallel connections and the latter allows you to take
time samples with microsecond resolution.
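To give an idea of the timing half of such a utility, here is a minimal sketch that uses only Time::HiRes. It runs the requests one after another rather than in parallel, and the request itself is replaced by a short sleep so that the example has no network dependency. The names time_requests and $fake_request are hypothetical; in a real tool the code reference would issue HTTP requests, for example through LWP::Parallel::UserAgent.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval sleep);

# Run $code->() $count times and return the elapsed wall-clock time
# in seconds together with the resulting rate per second.
# (time_requests is a hypothetical helper written for this illustration.)
sub time_requests {
    my ($count, $code) = @_;
    my $start = [gettimeofday];
    $code->() for 1 .. $count;
    my $elapsed = tv_interval($start);    # from $start until now
    my $rate = $elapsed > 0 ? $count / $elapsed : 0;
    return ($elapsed, $rate);
}

# Stand-in for a real HTTP fetch: pause for about one millisecond.
my $fake_request = sub { sleep(0.001) };

my $count = 50;
my ($elapsed, $rate) = time_requests($count, $fake_request);
printf "%d requests in %.3f s (%.1f requests/sec)\n",
    $count, $elapsed, $rate;
```

The reported numbers will vary from run to run and from machine to machine, so, as elsewhere in this chapter, they are useful mainly for comparing one variant against another on the same system.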