Performance tunning with PEAR::DB

January 16th, 2007. Tagged: mdb2, PEAR, php

If you use PEAR::MDB2, you can set a custom debug handler and collect all the queries you execute for debugging and performance tunning purposes, as shown before. But what if you're using PEAR::DB? Well, since PEAR::DB doesn't allow you such a functionality out of the box, you can hack it a bit to get similar results.

Simple app

Let's say you have a simple app:

<?php 
require_once 'DB.php';
 
$dsn = 'mysql://root@localhost/test';
$db =& DB::connect($dsn);
$db->setFetchMode(DB_FETCHMODE_ASSOC);
 
$sql = 'SELECT * FROM zipcodes';
$result = $db->getAll($sql);
$result = $db->getOne($sql);
$result = $db->getCol($sql);
$result = $db->getAll($sql);
$sql = 'SELECT zipcode FROM zipcodes';
$result = $db->getAll($sql);
$result = $db->getAll($sql);
$sql = 'SELECT CONCAT(zipcode, " - ", city) FROM zipcodes';
$result = $db->getAll($sql);
?>

Of course, this is an oversimplified example, usually you have more included files, class libraries and such, and it's not difficult to lose track of the database work as your app grows in complexity and size.

Now let's debug this app to see what type of database work it does.

Hacking PEAR::DB

In my case, I'm using MySQL, so I need to find the DB/mysql.php file in my PEAR directory. I open that file and find the simpleQuery() method. That's where all queries and up, sooner or later. I find this piece of code:

<?php
if (!$this->options['result_buffering']) {
    $result = @mysql_unbuffered_query($query, $this->connection);
} else {
    $result = @mysql_query($query, $this->connection);
}
?>

Then I hack this piece of code, adidng some lines before and after it. The result:

<?php
// start
$start_time = array_sum(explode(' ',microtime()));
// end
 
if (!$this->options['result_buffering']) {
    $result = @mysql_unbuffered_query($query, $this->connection);
} else {
    $result = @mysql_query($query, $this->connection);
}
 
// start
$query_took = array_sum(explode(' ',microtime())) - $start_time;
@$GLOBALS['global_query_counter']++;
@$GLOBALS['all_the_queries'][$GLOBALS['global_query_counter'] . ' - ' . $query] = $query_took;
//end
?>

Now as my app's pages are executed, I'll collect invaluable DB information.

Reporting

Let's see what we've collected.

You can add different types of reports in the footer of your application, or better yet, you can register a shutdown function to do the same. Here are some reporting ideas:

<?php
// report 1.
echo "<pre>All the queries, by the order they are executed:\\n";
print_r($GLOBALS['all_the_queries']);
echo '</pre>';
 
// report 2.
echo "<pre>All the queries, ordered by the time they took, descending:\\n";
arsort($GLOBALS['all_the_queries']);
print_r($GLOBALS['all_the_queries']);
echo '</pre>';
 
// report 3.
$sum = 0;
foreach ($GLOBALS['all_the_queries'] AS $t) {
    $sum += $t;
}
echo '<pre>';
echo 'Total number of queries:   ' . $GLOBALS['global_query_counter'] . "\\n";
echo 'Total time spend querying: ' . $sum;
echo '</pre>';
 
 
// report 4.
$distinct = array();
foreach ($GLOBALS['all_the_queries'] AS $q=>$t) {
    $parts = explode(' - ', $q);
    unset($parts[0]);
    $query = implode(' - ', $parts);
    @$distinct[$query]++;
}
echo "<pre>How many duplications:\\n";
arsort($distinct);
print_r($distinct);
echo '</pre>';
?>

Report results

Let's see what these reports gives us.

All the queries, by the order they are executed:
Array
(
    [1 - SELECT * FROM zipcodes] => 0.00626707077026
    [2 - SELECT * FROM zipcodes] => 0.00730204582214
    [3 - SELECT * FROM zipcodes] => 0.00796985626221
    [4 - SELECT * FROM zipcodes] => 0.00654602050781
    [5 - SELECT zipcode FROM zipcodes] => 0.0058650970459
    [6 - SELECT zipcode FROM zipcodes] => 0.0239379405975
    [7 - SELECT CONCAT(zipcode, " - ", city) FROM zipcodes] => 0.00581502914429
)

All the queries, ordered by the time they took, descending:
Array
(
    [6 - SELECT zipcode FROM zipcodes] => 0.0239379405975
    [3 - SELECT * FROM zipcodes] => 0.00796985626221
    [2 - SELECT * FROM zipcodes] => 0.00730204582214
    [4 - SELECT * FROM zipcodes] => 0.00654602050781
    [1 - SELECT * FROM zipcodes] => 0.00626707077026
    [5 - SELECT zipcode FROM zipcodes] => 0.0058650970459
    [7 - SELECT CONCAT(zipcode, " - ", city) FROM zipcodes] => 0.00581502914429
)

Total number of queries:   7
Total time spend querying: 0.0637030601501

How many duplications:
Array
(
    [SELECT * FROM zipcodes] => 4
    [SELECT zipcode FROM zipcodes] => 2
    [SELECT CONCAT(zipcode, " - ", city) FROM zipcodes] => 1
)

Thanks for reading!

Any comments or suggestions are very welcome!

Comments? Find me on BlueSky, Mastodon, LinkedIn, Threads, Twitter