Saturday, November 13, 2004

DOAML and PHPList hack

PHPList isn't great software. It's working software. I don't like it because it's not object-oriented: the code is harder to reuse, though easier to hack up. Less stability, but what can you do.

In this hack, we will be adding DOAML information to PHPList - RDF/XML output about the who's who of mailing lists.

Start: 21:40
Grab the latest PHPList. At this time, it's 2.9.3 - which means I'll have to upgrade the installation on the server. Waiting, waiting, backing up, uploading. Ick. So, I'm here to blog about the process.

What we have to do.
  1. Identify the points in the code for database abstraction. There are none at this time really.
  2. Query the database for all mailing lists
  3. Express mailing list information in RDF.
So... Step 1.

We grab the index.php file under /phplist/public_html/lists/ and rip all of the code right out. We cull what we don't need.
Bye bye function()s. You seem only to deal with the forms component of this interface.

11 minutes in. This looks bad, I'm still upgrading old code. I've poked about and removed all but what appears to be needed to get title and description information from the database - untested so far.

Ah hah. We're on. Next, we ditch the page headers and page footers. Extra HTML. Blech. Say byebye to any instance of $data["header"] and $data["footer"].

The fragment we are hunting is "Unsubscribe from our Mailinglists". Get that and we find where all the information is output. Unfortunately, Internationalization sucks for hunting through code sometimes. The URL however, is "?p=unsubscribe". A "find" for that quickly locates the data. Around line 211. Once you've butchered all of the functions located beneath this, it's the last line of code. Hurrah.
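
If you'd rather script the hunt than lean on an editor's "find", something like this rough sketch would do. The helper find_fragment() is made up for illustration (it's not part of PHPList), and the sample lines only stand in for the real index.php.

```php
<?php
// Hypothetical helper (not part of PHPList): return the 1-based line numbers
// on which $needle occurs in $source.
function find_fragment($source, $needle) {
    $hits = array();
    foreach (explode("\n", $source) as $i => $line) {
        if (strpos($line, $needle) !== false) {
            $hits[] = $i + 1;
        }
    }
    return $hits;
}

// A few made-up lines standing in for index.php.
$sample = implode("\n", array(
    'print $data["header"];',
    'printf(\'<p><a href="./?p=unsubscribe">%s</a></p>\',$strUnsubscribeTitle);',
    'print $data["footer"];',
));

// Search for the URL rather than the translated link text.
print implode(",", find_fragment($sample, '?p=unsubscribe')) . "\n";
```

On the real file you'd pass file_get_contents('index.php') as the first argument. Searching for the URL instead of the visible text sidesteps the internationalization problem, since the link text lives in a language file.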

Now what do we have? Simple, plain HTML spitting out all of the lists. Snippy snip time some more. Cull all of this crud that handles sessions. I don't know about you, but scutters don't give a stuff about cookies. Or logging in. By now I'm down to about 100 lines of PHP.


<?php
//header("content-type: text/plain");

ob_start();
$er = error_reporting(0); # some ppl have warnings on
if (isset($_SERVER["ConfigFile"]) && is_file($_SERVER["ConfigFile"])) {
  print '<!-- using '.$_SERVER["ConfigFile"].'-->'."\n";
  include $_SERVER["ConfigFile"];
} elseif (isset($_ENV["CONFIG"]) && is_file($_ENV["CONFIG"])) {
  print '<!-- using '.$_ENV["CONFIG"].'-->'."\n";
  include $_ENV["CONFIG"];
} elseif (is_file("config/config.php")) {
  print '<!-- using config/config.php -->'."\n";
  include "config/config.php";
} else {
  print "Error, cannot find config file\n";
  exit;
}
if (isset($GLOBALS["developer_email"])) {
  error_reporting(E_ALL);
} else {
  error_reporting($er);
}
require_once dirname(__FILE__).'/admin/'.$GLOBALS["database_module"];
require_once dirname(__FILE__)."/texts/english.inc";
include_once dirname(__FILE__)."/texts/".$GLOBALS["language_module"];
require_once dirname(__FILE__)."/admin/defaultconfig.inc";
require_once dirname(__FILE__).'/admin/connect.php';
include_once dirname(__FILE__)."/admin/languages.php";

if (!isset($_POST) && isset($HTTP_POST_VARS)) {
  require "admin/commonlib/lib/oldphp_vars.php";
}

$id = sprintf('%d',$_GET["id"]);

if ($_GET["uid"]) {
  $req = Sql_Fetch_Row_Query(sprintf('select subscribepage,id,password,email from %s where uniqid = "%s"',
    $tables["user"],$_GET["uid"]));
  $id = $req[0];
  $userid = $req[1];
  $userpassword = $req[2];
  $emailcheck = $req[3];
} elseif ($_GET["email"]) {
  $req = Sql_Fetch_Row_Query(sprintf('select subscribepage,id,password,email from %s where email = "%s"',
    $tables["user"],$_GET["email"]));
  $id = $req[0];
  $userid = $req[1];
  $userpassword = $req[2];
  $emailcheck = $req[3];
} else {
  $userid = "";
  $userpassword = "";
  $emailcheck = "";
}
# make sure the subscribe page still exists
$req = Sql_fetch_row_query(sprintf('select id from %s where id = %d',$tables["subscribepage"],$id));
$id = $req[0];
$msg = "";

if (!$id) {
  # find the default one:
  $id = getConfig("defaultsubscribepage");
  # fix the true/false issue
  if ($id == "true") $id = 1;
  if ($id == "false") $id = 0;
  if (!$id) {
    # pick a first
    $req = Sql_Fetch_row_Query(sprintf('select ID from %s where active',$tables["subscribepage"]));
    $id = $req[0];
  }
}

if ($login_required && !$_SESSION["userloggedin"] && !$canlogin) {
  print LoginPage($id,$userid,$emailcheck,$msg);
} elseif (preg_match("/(\w+)/",$_GET["p"],$regs)) {
  if ($id) {
  } else {
    FileNotFound();
  }
} else {
  if ($id) $data = PageData($id);
  print '<title>'.$GLOBALS["strSubscribeTitle"].'</title>';

  $req = Sql_Query(sprintf('select * from %s where active',$tables["subscribepage"]));
  if (Sql_Affected_Rows()) {
    while ($row = Sql_Fetch_Array($req)) {
      $intro = Sql_Fetch_Row_Query(sprintf('select data from %s where id = %d and name = "intro"',$tables["subscribepage_data"],$row["id"]));
      print $intro[0];
      printf('<p><a href="./?p=subscribe&id=%d">%s</a></p>',$row["id"],$row["title"]);
    }
  } else {
    printf('<p><a href="./?p=subscribe">%s</a></p>',$strSubscribeTitle);
  }

  printf('<p><a href="./?p=unsubscribe">%s</a></p>',$strUnsubscribeTitle);
}
?>

More hacking. We've gotten our very basic information. Time to change it from HTML to RDF/XML. We also want to pay attention to '$row["title"]' and '$intro[0]' - they become doaml:name and doaml:description :)

OK, I can't be stuffed explaining as I hack any more, so let's just paste you some code. Notice that we've replaced the HTML with RDF/XML wherever it was output. We've added all of the relevant headers and changed a few things to make it valid XML. And we are now serving it up as text/xml.


<?php
header("content-type: text/xml");
print '<' . '?xml version="1.0" encoding="iso-8859-1"?' . '>' . "\n";
?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dct="http://purl.org/dc/terms/"
xmlns:doaml="http://ns.balbinus.net/doaml#">
<?php
ob_start();
$er = error_reporting(0); # some ppl have warnings on
if (isset($_SERVER["ConfigFile"]) && is_file($_SERVER["ConfigFile"])) {
  print '<!-- using '.$_SERVER["ConfigFile"].'-->'."\n";
  include $_SERVER["ConfigFile"];
} elseif (isset($_ENV["CONFIG"]) && is_file($_ENV["CONFIG"])) {
  print '<!-- using '.$_ENV["CONFIG"].'-->'."\n";
  include $_ENV["CONFIG"];
} elseif (is_file("config/config.php")) {
  print '<!-- using config/config.php -->'."\n";
  include "config/config.php";
} else {
  print "Error, cannot find config file\n";
  exit;
}
if (isset($GLOBALS["developer_email"])) {
  error_reporting(E_ALL);
} else {
  error_reporting($er);
}
require_once dirname(__FILE__).'/admin/'.$GLOBALS["database_module"];
require_once dirname(__FILE__)."/texts/english.inc";
include_once dirname(__FILE__)."/texts/".$GLOBALS["language_module"];
require_once dirname(__FILE__)."/admin/defaultconfig.inc";
require_once dirname(__FILE__).'/admin/connect.php';
include_once dirname(__FILE__)."/admin/languages.php";

if (!isset($_POST) && isset($HTTP_POST_VARS)) {
  require "admin/commonlib/lib/oldphp_vars.php";
}

$id = sprintf('%d',$_GET["id"]);

if ($_GET["uid"]) {
  $req = Sql_Fetch_Row_Query(sprintf('select subscribepage,id,password,email from %s where uniqid = "%s"',
    $tables["user"],$_GET["uid"]));
  $id = $req[0];
  $userid = $req[1];
  $userpassword = $req[2];
  $emailcheck = $req[3];
} elseif ($_GET["email"]) {
  $req = Sql_Fetch_Row_Query(sprintf('select subscribepage,id,password,email from %s where email = "%s"',
    $tables["user"],$_GET["email"]));
  $id = $req[0];
  $userid = $req[1];
  $userpassword = $req[2];
  $emailcheck = $req[3];
} else {
  $userid = "";
  $userpassword = "";
  $emailcheck = "";
}
# make sure the subscribe page still exists
$req = Sql_fetch_row_query(sprintf('select id from %s where id = %d',$tables["subscribepage"],$id));
$id = $req[0];
$msg = "";

if (!$id) {
  # find the default one:
  $id = getConfig("defaultsubscribepage");
  # fix the true/false issue
  if ($id == "true") $id = 1;
  if ($id == "false") $id = 0;
  if (!$id) {
    # pick a first
    $req = Sql_Fetch_row_Query(sprintf('select ID from %s where active',$tables["subscribepage"]));
    $id = $req[0];
  }
}

if ($login_required && !$_SESSION["userloggedin"] && !$canlogin) {
  print LoginPage($id,$userid,$emailcheck,$msg);
} elseif (preg_match("/(\w+)/",$_GET["p"],$regs)) {
  if ($id) {
  } else {
    FileNotFound();
  }
} else {
  if ($id) $data = PageData($id);
  $req = Sql_Query(sprintf('select * from %s where active',$tables["subscribepage"]));
  if (Sql_Affected_Rows()) {
    while ($row = Sql_Fetch_Array($req)) {
      $intro = Sql_Fetch_Row_Query(sprintf('select data from %s where id = %d and name = "intro"',$tables["subscribepage_data"],$row["id"]));
      # escape text for XML: titles and intros can contain &, < and >
      $title = htmlspecialchars($row["title"]);
      $description = htmlspecialchars($intro[0]);

      print '<doaml:Newsletter rdf:type="http://ns.balbinus.net/doaml#MemberOnlyNewsletter">' . "\n";

      # the & in the URL has to be written as &amp; inside an XML attribute
      printf(' <doaml:description-page rdf:resource="./?p=subscribe&amp;id=%d" />' . "\n",$row["id"]);
      print ' <doaml:name>' . $title . '</doaml:name>' . "\n";
      print ' <doaml:topic>' . $title . '</doaml:topic>' . "\n";
      print ' <doaml:description>' . $description . '</doaml:description>' . "\n";
      printf(' <rdfs:seeAlso><rdf:Description rdf:about="./?p=subscribe&amp;id=%d"><dc:title>%s</dc:title></rdf:Description></rdfs:seeAlso>' . "\n",$row["id"],$title);

      print '</doaml:Newsletter>' . "\n";
    }
  } else {
    $title = htmlspecialchars($strSubscribeTitle);
    print '<doaml:Newsletter rdf:type="http://ns.balbinus.net/doaml#MemberOnlyNewsletter">' . "\n";
    print ' <doaml:description-page rdf:resource="./?p=subscribe" />' . "\n";
    print ' <doaml:name>' . $title . '</doaml:name>' . "\n";
    printf(' <rdfs:seeAlso><rdf:Description rdf:about="./?p=subscribe"><dc:title>%s</dc:title></rdf:Description></rdfs:seeAlso>' . "\n",$title);
    print '</doaml:Newsletter>' . "\n";
  }

}
?>
</rdf:RDF>
<!--
<doaml:requests rdf:resource="mailto:doaml-interest-request@lists.sourceforge.net" />
<doaml:mbox rdf:resource="mailto:doaml-interest@lists.sourceforge.net" />

<doaml:moderator rdf:nodeID="vincent" />

<doaml:name>DOAML-interest</doaml:name>
<doaml:description>Discussions about DOAML (Description Of A Mailing List)</doaml:description>
<doaml:created>2004-11-10</doaml:created>
<doaml:creator rdf:nodeID="vincent" />
<doaml:topic rdf:resource="http://www.doaml.net/" />

<foaf:Person rdf:nodeID="vincent">
<foaf:name>Vincent Tabard</foaf:name>
<foaf:mbox_sha1sum>ef755f7a687f4a443e47295cc1b3ac3b8c935037</foaf:mbox_sha1sum>
<foaf:homepage rdf:resource="http://www.balbinus.net/" />
<rdfs:seeAlso rdf:resource="http://foaf.balbinus.net/" />
</foaf:Person>
-->
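
One thing that bites when you swap HTML for RDF/XML: anything that comes out of the database has to be escaped, and so does the "&" in the subscribe URL, or the parser on the other end chokes. Here's a minimal sketch of that idea. The function names xml_escape() and doaml_newsletter_xml() are my own inventions for illustration; the element names are the ones used in the output above.

```php
<?php
// Hypothetical helper: escape a value for XML text or attribute content.
// ISO-8859-1 matches the encoding declared in the XML prolog.
function xml_escape($value) {
    return htmlspecialchars($value, ENT_QUOTES, 'ISO-8859-1');
}

// Hypothetical helper: build one newsletter element from a subscribe page's
// id, title and intro text.
function doaml_newsletter_xml($id, $title, $intro) {
    $url  = xml_escape('./?p=subscribe&id=' . (int)$id);
    $xml  = '<doaml:Newsletter>' . "\n";
    $xml .= ' <doaml:description-page rdf:resource="' . $url . '" />' . "\n";
    $xml .= ' <doaml:name>' . xml_escape($title) . '</doaml:name>' . "\n";
    $xml .= ' <doaml:description>' . xml_escape($intro) . '</doaml:description>' . "\n";
    $xml .= '</doaml:Newsletter>' . "\n";
    return $xml;
}

// Made-up sample data with characters that would otherwise break the XML.
print doaml_newsletter_xml(3, 'News & Updates', '<p>Monthly news</p>');
```

The intro text in PHPList can hold HTML, so escaping keeps the markup as literal text. Running it through strip_tags() first would be another reasonable choice if you only want the words.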

Now we have to peel back one more layer of the orange. We've had a little peek at the code of users.php and felt kind of queasy - this, kids, is why Object Oriented programming is good. You take the mess out of everything. It would take us minutes, if not hours, to follow the convoluted code here only to realise we are looking in the wrong place. Screw it. Directly accessing the database time!


<?php
$req = Sql_Query(sprintf('select * from %s',$tables["user"]));
if (Sql_Affected_Rows()) {
  while ($row = Sql_Fetch_Array($req)) {
    print_r($row);
  }
}
?>


Grab all of the users. All of the information. Blech, it's too much. Grab only the relevant fields - id, email, uniqid. Spit out a sha1 of the mbox for each user, so we have a FOAF IFP (Inverse Functional Property).


<?php
$req = Sql_Query(sprintf('select id, email, uniqid from %s',$tables["user"]));
if (Sql_Affected_Rows()) {
  while ($row = Sql_Fetch_Array($req)) {
    //print_r($row);
    print sha1('mailto:' . $row['email']) . "\n";
    //print sha1($row['email']) . "\n";
  }
}
?>
Mmm, tasty. Let's step one little bit further.


<?php
$req = Sql_Query(sprintf('select id, email, uniqid from %s',$tables["user"]));
if (Sql_Affected_Rows()) {
  while ($row = Sql_Fetch_Array($req)) {
    //print_r($row);
    print sha1('mailto:' . $row['email']) . "\n";
    print $row['id'] . "\n";
    //print sha1($row['email']) . "\n";
    # %d keeps the user id numeric instead of splicing it into the format string
    $query = Sql_Query(sprintf('select listid, userid from %s where userid = %d',$tables["listuser"],$row["id"]));
    if (Sql_Affected_Rows()) {
      while ($rows = Sql_Fetch_Array($query)) {
        print_r($rows);
      }
    }
  }
}
?>
So now we are spitting out a list of every user and all of the lists they are subscribed to. Useful, but we'll comment that out and save it for when we wish to examine one person in particular by their sha1 hash.
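
For when that day comes, a lookup by hash might look roughly like this. It's a sketch only: both function names are hypothetical, and the user rows are made-up stand-ins for what the user query returns.

```php
<?php
// Hypothetical helper: the SHA1 of the mailto: URI, computed the same way
// as in the loop above.
function mbox_sha1sum($email) {
    return sha1('mailto:' . $email);
}

// Hypothetical helper: return the first user row whose email hashes to
// $hash, or null when nobody matches.
function find_user_by_mbox_sha1($users, $hash) {
    foreach ($users as $user) {
        if (mbox_sha1sum($user['email']) === strtolower($hash)) {
            return $user;
        }
    }
    return null;
}

// Made-up rows standing in for the result of the user query.
$users = array(
    array('id' => 1, 'email' => 'alice@example.org'),
    array('id' => 2, 'email' => 'bob@example.org'),
);

$hash  = mbox_sha1sum('bob@example.org');
$found = find_user_by_mbox_sha1($users, $hash);
print $found['id'] . "\n";
```

The hash can't be reversed inside the database query, so this walks the rows and hashes each address. On a big list you'd probably want to store the hash in its own column instead.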

So what's next? Take the above code, and reverse the order of the queries - select all users subscribed to a particular list.
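
Here's one way the reversed query could go: a single join instead of a query per user. Treat it as a sketch. The function name is made up, the table names in $tables are placeholders (the real ones depend on your install), and the column names are the ones used in the queries above.

```php
<?php
// Hypothetical helper: build one query that returns every user subscribed
// to the list with id $listid. %d forces the list id to be numeric.
function list_members_sql($tables, $listid) {
    return sprintf(
        'select u.id, u.email from %s lu, %s u where lu.userid = u.id and lu.listid = %d',
        $tables["listuser"], $tables["user"], $listid
    );
}

// Placeholder table names, assumed for this example only.
$tables = array("user" => "phplist_user", "listuser" => "phplist_listuser");

print list_members_sql($tables, 3) . "\n";
```

The resulting string can be handed to Sql_Query() and looped over with Sql_Fetch_Array(), exactly like the user query earlier.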

Clean it all up, stick it into functions.
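
Had I bothered, the tidy version might have started with something like this. It's a rough sketch with hypothetical function names: each subscriber becomes a foaf:Person that exposes only the hashed mailbox, never the address itself.

```php
<?php
// Hypothetical function: render one subscriber as a foaf:Person element
// carrying only the SHA1 of the mailto: URI.
function foaf_person_xml($email) {
    return '<foaf:Person>' . "\n"
         . ' <foaf:mbox_sha1sum>' . sha1('mailto:' . $email) . '</foaf:mbox_sha1sum>' . "\n"
         . '</foaf:Person>' . "\n";
}

// Hypothetical function: render a whole list of addresses.
function foaf_people_xml($emails) {
    $xml = '';
    foreach ($emails as $email) {
        $xml .= foaf_person_xml($email);
    }
    return $xml;
}

// Made-up addresses for illustration.
print foaf_people_xml(array('alice@example.org', 'bob@example.org'));
```

A sha1 hash is all hex digits, so it needs no XML escaping. How these people get attached to a particular newsletter is left open here.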

Ah, screw it. I'm already finished.
