Sunday, April 06, 2008

PEAR and Gearman

PEAR might be getting a Gearman package.

What is Gearman? It's a quick and easy way to do distributed processing, in use at Yahoo, Digg, and LiveJournal, where it was originally developed.

There is a server, gearmand, and you run several workers written in PHP (or another language). Each worker tells the server which tasks it can perform, and Gearman distributes the work among them.

An example worker:

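require_once 'Net/Gearman/Worker.php';

try {
    // Connect to the two gearmand instances running on dev01.
    $worker = new Net_Gearman_Worker(array('dev01:7003', 'dev01:7004'));

    // Register the jobs this worker is able to run.
    $worker->addAbility('Hello');
    $worker->addAbility('Fail');
    $worker->addAbility('SQL');

    // Wait for jobs and process them as they arrive.
    $worker->beginWork();
} catch (Net_Gearman_Exception $e) {
    echo $e->getMessage() . "\n";
    exit;
}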


An example job that runs SQL queries:

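require_once 'DB.php';

class Net_Gearman_Job_SQL extends Net_Gearman_Job_Common
{
    public function run(stdClass $arg)
    {
        // Reject jobs that arrive without a query to run.
        if (!isset($arg->sql) || !strlen($arg->sql)) {
            throw new Net_Gearman_Job_Exception('No SQL query given');
        }

        // Note: this runs whatever SQL the client sends, so keep it on a trusted network.
        $db = DB::connect('mysql://testing:testing@192.168.243.20/testing');
        if (PEAR::isError($db)) {
            throw new Net_Gearman_Job_Exception($db->getMessage());
        }

        $db->setFetchMode(DB_FETCHMODE_ASSOC);

        // The rows are sent back to the client as the job's result.
        return $db->getAll($arg->sql);
    }
}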


Putting it all together, with a client that submits three queries as one set:

require_once 'Net/Gearman/Client.php';

$set = new Net_Gearman_Set();

// Callback invoked with the response of each completed task.
function result($resp) {
    print_r($resp);
}

$sql = array(
    "SELECT * FROM users WHERE username = 'joestump'",
    "SELECT * FROM users WHERE username LIKE 'joe%' LIMIT 10",
    "SELECT * FROM items WHERE deleted = 0 LIMIT 10"
);

// Build one 'SQL' task per query and collect them in the set.
foreach ($sql as $s) {
    $task = new Net_Gearman_Task('SQL', array(
        'sql' => $s
    ));

    $task->attachCallback('result');
    $set->addTask($task);
}

// Submit the whole set to the job server on dev01.
$client = new Net_Gearman_Client(array('dev01'));
$client->runSet($set);
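
The string 'SQL' is what ties the client's task to the worker's ability. To make that routing idea concrete, here is a toy, single-process sketch. It is not how gearmand or Net_Gearman is implemented: ToyJobServer and its methods are made-up names for illustration, and the round-robin hand-out is a simplification (real workers run as separate processes and ask the server for their next job over the network).

```php
<?php
// Toy illustration only: ToyJobServer is a hypothetical class, not part of
// Net_Gearman, and everything runs in one process instead of over the network.

final class ToyJobServer
{
    /** @var array<string, callable[]> ability name => handlers offering it */
    private array $handlers = [];

    /** @var array<string, int> ability name => index of the next handler to use */
    private array $next = [];

    // A "worker" announces that it can perform $ability.
    public function addAbility(string $ability, callable $handler): void
    {
        $this->handlers[$ability][] = $handler;
    }

    // A "client" submits a job; handlers for that ability take turns.
    public function submit(string $ability, array $args)
    {
        if (empty($this->handlers[$ability])) {
            throw new RuntimeException("No worker can perform '$ability'");
        }

        $pool = $this->handlers[$ability];
        $i = $this->next[$ability] ?? 0;
        $this->next[$ability] = ($i + 1) % count($pool);

        return $pool[$i]($args);
    }
}

$server = new ToyJobServer();

// Two "workers" both offer the Hello ability.
$server->addAbility('Hello', fn(array $a) => "worker 1 says hello to {$a['name']}");
$server->addAbility('Hello', fn(array $a) => "worker 2 says hello to {$a['name']}");

// The "client" only names the ability; it never picks a worker itself.
$results = [];
foreach (['joe', 'ann', 'bob'] as $name) {
    $results[] = $server->submit('Hello', ['name' => $name]);
}

print_r($results);
```

The point of the sketch is the indirection: the client names a job, the server decides who runs it, so workers can be added or removed without touching client code.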


Some use cases which immediately spring to mind for me include:
* Running distributed PHPUnit runs against a common revision of the code
* Running our document generator processes
* Running our webservice/queuing processes
* As part of a scutter (multiple documents to retrieve)
