Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Mail Systems
Eclipse Documentation

How To Guides
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Problem Solutions
Privacy Policy




20.2. Improving Performance

Let's now talk about various techniques that allow you to boost the speed of applications that work with relational databases. A whole book could be devoted to this topic, so here we will concentrate on the techniques that apply specifically to mod_perl servers.

20.2.3. Efficient Record-Retrieval Techniques

When working with a relational database, you'll often encounter the need to read the retrieved set of records into your program, then format and print them to the browser.

Assuming that you're already connected to the database, let's consider the following code prototype:

my $query = "SELECT id,fname,lname FROM test WHERE id < 10";
my $sth = $dbh->prepare($query);

my @results = ( );
while (my @row_ary  = $sth->fetchrow_array) {
    push @results, [ transform(@row_ary) ];
# print the output using the the data returned from the DB

In this example, the httpd process will grow by the size of the variables that have been allocated for the records that matched the query. Remember that to get the total amount of extra memory required by this technique, this growth should be multiplied by the number of child processes that your server runs—which is probably not a constant.

A better approach is not to accumulate the records, but rather to print them as they are fetched from the DB. You can use the methods $sth->bind_columns( ) and $sth->fetchrow_arrayref( ) (aliased to $sth->fetch( )) to fetch the data in the fastest possible way. Example 20-1 prints an HTML table with matched data. Now the only additional memory consumed is for an @cols array to hold temporary row values.

Example 20-1.

my $query = "SELECT id,fname,lname FROM test WHERE id < 10";
my @fields = qw(id fname lname);

# create a list of cols values
my @cols = ( );
@cols[0..$#fields] = ( );
$sth = $dbh->prepare($query);

# Bind perl variables to columns.
$sth->bind_columns(undef, \(@cols));
print "<table>";
print '<tr bgcolor="grey">', 
    map("<th>$_</th>", @fields), "</tr>";
while ($sth->fetch) {
    print "<tr>", 
        map("<td>$_</td>", @cols), "</tr>";
print "</table>";

Note that this approach doesn't tell you how many records have been matched. The workaround is to run an identical query before the code above, using SELECT count(*)... instead of SELECT * ... to get the number of matched records:

my $query = "SELECT count(*) FROM test WHERE id < 10";

This should be much faster, since you can remove any SORT BY and similar attributes.

You might think that the DBI method $sth->rows will tell you how many records will be returned, but unfortunately it will not. You can rely on a row count only after a do (for some specific operations, such as update and delete), after a non-select execute, or after fetching all the rows of a selectstatement.

For selectstatements, it is generally not possible to know how many rows will be returned except by fetching them all. Some DBD drivers will return the number of rows the application has fetched so far, but others may return -1 until all rows have been fetched. Thus, use of the rows method with select statements is not recommended.

20.2.4. mysql_use_result Versus mysql_store_result Attributes

Many mod_perl developers use MySQL as their preferred relational database server because of its speed. Depending on the situation, it may be possible to change the way in which the DBD::mysql driver delivers data. The two attributes mysql_use_result and mysql_store_result influence the speed and size of the processes.

You can tell the DBD::mysql driver to change the default behavior before you start to fetch the results:

my $sth = $dbh->prepare($query);
$sth->{"mysql_use_result"} = 1;

This forces the driver to use mysql_use_result rather than mysql_store_result. The former is faster and uses less memory, but it tends to block other processes, which is why mysql_store_result is the default.

Think about it in client/server terms. When you ask the server to spoon-feed you the data as you use it, the server process must buffer the data, tie up that thread, and possibly keep database locks open for a long time. So if you read a row of data and ponder it for a while, the tables you have locked are still locked, and the server is busy talking to you every so often. That is the situation with mysql_use_result.

On the other hand, if you just suck down the whole data set to the client, then the server is free to serve other requests. This improves parallelism, since rather than blocking each other by doing frequent I/O, the server and client are working at the same time. That is the situation with mysql_store_result.

As the MySQL manual suggests, you should not use mysql_use_result if you are doing a lot of processing for each row on the client side. This can tie up the server and prevent other threads from updating the tables.

If you are using some other DBD driver, check its documentation to see if it provides the flexibility of DBD::mysql in this regard.

Copyright © 2003 O'Reilly & Associates. All rights reserved.

  Published courtesy of O'Reilly Design by Interspire