Sunday, November 27, 2011

I'm addicted to Wholesale Meat

If there's one thing I like, it's meat. If there's another, it's bulk. Gawler River Cattle Co seems to provide both, happily.

That is all.

Sunday, November 20, 2011

I need a new lawnmower

Right now, I have an out-of-control back yard and a dire need for a lawnmower. What kind do people suggest?

Sunday, November 13, 2011

Why is Gwibber so slow?

Gwibber seems horribly slow and CPU-intensive. From what I've read, there is much finger-pointing and little actual solution - just vague hand-waving about desktopcouch being a terrible backend.

Development seems dead in the water. How have we ended up at this point? Why don't I have ways to integrate TweetDeck or any other client as well? Why does gwibber-service need to run all of the time?


Managing multiple job configurations for Jenkins

If you are in the same boat as I am, you find you have too many packages to look after with Jenkins.

The beauty of Jenkins is the simplicity of setting up a job with the web frontend - but once you get over a certain level of complexity, this is actually one of the bigger drawbacks.

Sure, we've got some templates, but how far can you really stretch it?

In my situation, I need to:

  1. Trawl SVN/other version control for all packages available - several hundred
  2. Only if the package has tests, add an entry to the CI suite
  3. Adapt to packages which only run happily under E_ALL & ~E_STRICT
  4. Provide a mechanism to install dependencies for packages which require them but can't themselves be installed
  5. Invoke some packages with the legacy AllTests.php
  6. Detect when a package has migrated to github
  7. ... and update an existing build/job with a new tool when required

I had tackled part 1 with PEAR's "packages-all" SVN link, which points to the trunk branches of all relevant code, and had written some scripts for CruiseControl to find every package with a tests/ directory, but I find myself in need of something more.
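
Parts 1, 2 and 5 of the list can be sketched roughly as below. This is not the script I used for CruiseControl, just a minimal illustration; it assumes a local checkout where each package is a top-level directory, and the function names are made up for the example.

```python
import os


def packages_with_tests(checkout_root):
    """Return sorted names of top-level package directories containing tests/."""
    found = []
    for name in sorted(os.listdir(checkout_root)):
        package_dir = os.path.join(checkout_root, name)
        # Only packages that ship a tests/ directory get a CI entry.
        if os.path.isdir(os.path.join(package_dir, "tests")):
            found.append(name)
    return found


def uses_legacy_alltests(checkout_root, package):
    """True if the package still ships the legacy tests/AllTests.php runner."""
    return os.path.isfile(
        os.path.join(checkout_root, package, "tests", "AllTests.php")
    )
```

Each name returned by packages_with_tests() would become one job, with uses_legacy_alltests() deciding which runner that job invokes.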

So, my code is on GitHub for now, and you can see the current CI system where those scripts have installed new jobs.

I'm quite sure that Pyrus and a local installation will deal with the dependencies, as they are all described in PEAR's package.xml format. Detecting when a package has shifted to GitHub should also be fairly easy to tackle, as there is much work underway to deal with migration.

The one area I need to explore is manipulating Jenkins jobs via XPath, to understand which parts of a job are already present and which need updating - basically number seven in the above list.
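
As a starting point, here is a minimal sketch of that idea using the limited XPath support in Python's standard library. It assumes a freestyle job whose config.xml keeps shell build steps under builders/hudson.tasks.Shell; the sample XML, the commands in it, and the function name are all hypothetical, and fetching or posting config.xml to Jenkins is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed job definition for illustration only.
SAMPLE_CONFIG = """<project>
  <description>Hypothetical PEAR package job</description>
  <builders>
    <hudson.tasks.Shell>
      <command>phpunit tests/AllTests.php</command>
    </hudson.tasks.Shell>
  </builders>
</project>"""

# XPath to the command text of every shell build step in the job.
SHELL_COMMANDS = "./builders/hudson.tasks.Shell/command"


def ensure_shell_step(config_xml, wanted_command):
    """Return (xml, changed); append a shell build step only if it is missing."""
    root = ET.fromstring(config_xml)
    existing = [node.text for node in root.findall(SHELL_COMMANDS)]
    if wanted_command in existing:
        # Already present: hand back the original XML untouched.
        return config_xml, False
    builders = root.find("./builders")
    if builders is None:
        builders = ET.SubElement(root, "builders")
    step = ET.SubElement(builders, "hudson.tasks.Shell")
    ET.SubElement(step, "command").text = wanted_command
    return ET.tostring(root, encoding="unicode"), True
```

Calling ensure_shell_step() twice with the same command changes the job only the first time, which is the property an updater needs when it is re-run across several hundred jobs.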

I'm curious who's done this sort of thing before, regardless of language, and if there are any libraries which make it easier to do this sort of thing.


Monday, October 31, 2011

Ausgrid data sets

Ausgrid are publishing data on how much electricity people use, plus much more about their network. It's a pity it's NSW only, but if you were wondering how green your LGA was, this is your answer.

Obviously, a heatmap would make a neat visualisation - but what else could you do with this data?

Saturday, April 30, 2011

Goodreads, Freebase

Goodreads provides an API and has lots of data about books, editions, reviews and authors. Freebase has quite a lot of data about the same individuals.

It would be good to reconcile all of the Goodreads authors with Freebase/DBpedia entries, or to populate Freebase with links to Goodreads reviews by ISBN.

Tuesday, April 05, 2011

On being busy

Work is busy right now, and I think it will be for the next few years too.



Saturday, March 19, 2011

XML_GRDDL, BestBuy & Good Relations

Digg used to publish RDFa, but it appears to have given it the boot.

So who is out there publishing useful RDFa? Best Buy, of course.

While they appear to have sold out of their example RDFa product, you can still get a heck of a lot of data out about the store itself.

The code:

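<?php
// PEAR packages: XML_GRDDL does the extraction, Log provides console logging.
require_once 'XML/GRDDL.php';
require_once 'Log.php';

$url = 'http://stores.bestbuy.com/577/fairless-hills-pa/products/open-box/frigidaire-30-freestanding-range/0012505540066/?uid=118';

// Use the XSL driver and log progress to the console.
$options = XML_GRDDL::getDefaultOptions();
$options['log'] = Log::singleton('console');
$grddl = XML_GRDDL::factory('xsl', $options);

// Fetch the page, then append the RDFa profile to the document.
$data = $grddl->fetch($url);
$data = $grddl->appendProfiles($data, array('http://ns.inria.fr/grddl/rdfa/'));

// Find the applicable stylesheets and run each one over the document.
$stylesheets = $grddl->inspect($data, $url);

$rdfXml = array();
foreach ($stylesheets as $stylesheet) {
    $rdfXml[] = $grddl->transform($stylesheet, $data);
}

// Merge the per-stylesheet results into a single document.
$result = array_reduce($rdfXml, array($grddl, 'merge'));

print $result;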



The result? 80 or so triples come out, describing everything from the store's Facebook account to its geolocation, address, telephone number, email, opening hours and more.

Give it a go yourself:

$ pear install -f XML_GRDDL
$ cd /usr/share/php/doc/XML_GRDDL/docs
$ php bestbuy-rdfa.php | less

Friday, March 18, 2011

Rabbit VCS

James points me to Rabbit VCS.

"total clone of tortoise... I'm so happy right now."

Sunday, January 09, 2011

XBMC vs Boxee vs file browsers

I ditched Boxee the other day. I was sick of it eating far too many resources, never quite keeping up with my mouse, fighting me with strange user interface metaphors, ignoring some files it would never recognize, and providing me with not much enjoyment.

I installed XBMC and got rid of Boxee. That was before I realized how good I had it: Boxee's scraper doesn't work off regexps, and doesn't demand that you do crazy things to your filesystem.

What I am struggling to understand is why I can't simply have a damned metadata layer and scraper for my desktop which plugs easily into the file browser.

All that both Boxee and XBMC are really doing is scrape, fetch, and organise. XBMC does it better with a few widgets like "recently discovered episodes", but even then, how hard is it?

I advocate the semantic web stack just because this is the very problem those tools should be able to solve in a trivial fashion, but I don't understand why this has not been achieved by anyone else in the last decade.

It's not like web scraping is mind-bogglingly hard, nor is matching little bits of string against certain sources. MusicBrainz solved the problem for mp3s some time ago; if the same can't be done for video content, I would be surprised. Additionally, it's not like relational databases are a new thing. They have been around for a bit, and that's all you really need: schema, inserts, and the web scraper organising it all.
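
To show how little is involved, here is a rough sketch of the "schema and inserts" half: one regexp that pulls show, season and episode out of a filename, and one table to hold the result. The filename convention, table layout and function names are assumptions for illustration, and the scraper that would fill in real metadata is not shown.

```python
import re
import sqlite3

# Assumed convention: "Show.Name.S02E05.ext" (separators may be space, dot, _ or -).
EPISODE_PATTERN = re.compile(
    r"(?P<show>.+?)[ ._-]+[Ss](?P<season>\d{1,2})[Ee](?P<episode>\d{1,2})"
)


def parse_episode(filename):
    """Return (show, season, episode), or None if the name does not match."""
    match = EPISODE_PATTERN.search(filename)
    if match is None:
        return None
    show = re.sub(r"[._]+", " ", match.group("show")).strip()
    return show, int(match.group("season")), int(match.group("episode"))


def index_files(connection, filenames):
    """Create the table if needed and insert every filename that parses."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS episodes ("
        "path TEXT PRIMARY KEY, show TEXT, season INTEGER, episode INTEGER)"
    )
    for name in filenames:
        parsed = parse_episode(name)
        if parsed is not None:
            connection.execute(
                "INSERT OR REPLACE INTO episodes VALUES (?, ?, ?, ?)",
                (name, *parsed),
            )
    connection.commit()
```

A file browser could then query the episodes table instead of showing a bare directory listing.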

So it really, really pains me when the Boxee/XBMC/Gnome world can't get it right. Why does Gnome/Gnautilus still treat media basically like this:

Whilst your original design might have been a GUI filesystem explorer, what I actually want from you is a GUI-like-XBMC-Metadata-displayer.