Saturday, January 12, 2008

ActiveSesame: In Progress

Working in the world of medical informatics, it's impossible to avoid ontology hype. Through the '90s, the answer to every problem of disparate medical terminology between institutions, departments, and IT systems was coding. Let's give everything its own unique code and it will all be okay.
return coding == failure ? build_ontologies : save_the_world!
It failed, adding yet another layer of complexity to an already confused industry. So the world started building ontologies. RDF (Resource Description Framework) and OWL (the Web Ontology Language, renamed from WOL when Tim Finin said WOL was a stinky name and suggested swapping the letters) are all the rage. Web 3.0! Well, the Cyc project has been working since the '80s on a true ontology for the world... Ever hear of the Psych project? Well, that should tell you something...

But the thing that irks me about the situation isn't the dreamers. We need those senior academics... maybe. What is missing is good tools for using ontologies in programs. So a while back I convinced my boss that I should do something about it and started work on ActiveSesame. I wanted a Ruby gem that would be the ActiveRecord of triple stores, with the obvious bonus: ActiveSesame would have to read the ontology and build Ruby objects on the fly.

Why? Isn't ActiveRDF out there? The sad answer to that question is that I couldn't get ActiveRDF to connect to a triple store. I tried for a few days and gave up. Now, that could be my fault entirely, and I'm excited about what the ActiveRDF cats are doing, but it's good to have a few projects to choose from. And as of today the ActiveRDF docs still bite. So here is a feature list of what I'm shooting for before this year's RailsConf.

ActiveSesame Feature List:
  • Connect to and Interact with the AllegroGraph triple store via the Sesame Protocol
    • Make SPARQL queries
    • Add Triples to the store
  • Build Ruby Objects based on SPARQL XML return data
    • Build Classes whose instances are RDF individuals
      • Built dynamically or declared as a Model, allowing application-specific extensions
    • Instances include methods for all RDF attributes whose Domain is the class of the individual
    • When an attribute's range is an RDF class and not a literal, build new Ruby Object(s) based on cardinality rules
    • Handle Blank nodes
  • Save RDF Objects to the triple store
    • New Objects can be saved with .save
      • Include a uniqueness check
    • Objects already in the triple store can be updated
      • Only the attributes which have been changed will be updated
  • Abstract common SPARQL Queries into .find method
    • Grab classes by namespace (MyRDFClass.find(:first))
    • Build RDF XML-to-datatype methods to make find_by_sparql easier to use
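
To make the .find idea a little more concrete, here is a minimal sketch of how a call like MyRDFClass.find(:first) might be translated into a SPARQL query and a Sesame-protocol request URL. Everything named here (SparqlFinder, query_for, request_uri, the example.org class URI, the server URL, and the "demo" repository) is hypothetical and for illustration only, not ActiveSesame's actual API. The one thing taken from the Sesame HTTP protocol is that queries are sent to /repositories/<id> with a "query" parameter. Nothing is actually sent over the network.

```ruby
require "uri"

# Hypothetical helper, not part of ActiveSesame: turns find-style
# options into a SPARQL SELECT for the individuals of one RDF class.
module SparqlFinder
  RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

  # scope is :first or :all; class_uri is the full URI of the RDF class.
  def self.query_for(scope, class_uri)
    query = "SELECT ?individual WHERE { ?individual <#{RDF_TYPE}> <#{class_uri}> }"
    query += " LIMIT 1" if scope == :first
    query
  end

  # Builds the request URL only. base_url and repository are assumed
  # values; a real store would have its own.
  def self.request_uri(base_url, repository, sparql)
    uri = URI.parse("#{base_url}/repositories/#{repository}")
    uri.query = URI.encode_www_form(query: sparql)
    uri
  end
end

sparql = SparqlFinder.query_for(:first, "http://example.org/med#Patient")
puts sparql
puts SparqlFinder.request_uri("http://localhost:8080/sesame", "demo", sparql)
```

Keeping the query building separate from the HTTP call means the SPARQL strings can be checked without a running triple store.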

So it's a big list, but I have a major chunk of it done, certainly enough to want to show it off a bit at RailsConf. It wasn't terribly hard, actually. Ruby has some excellent meta-programming features. I say excellent, but really they should be in every language; the ideas behind them are not new. I've been hammered with a lot of other things at work recently and haven't been able to work on it for a while, but if my RailsConf proposal is accepted then I'll have a great reason to demand more time from my employers to work on it.
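
For a flavor of the kind of meta-programming involved, here is a minimal sketch that builds a Ruby class at runtime, assuming the class name and its property list have already been read from the ontology. OntologyBuilder, build_class, and changed are hypothetical names chosen for illustration; this is not ActiveSesame's actual code, and it skips the hard parts (ranges, cardinality, blank nodes).

```ruby
# Hypothetical sketch: build a Ruby class on the fly from a class name
# and the list of properties whose domain is that class.
module OntologyBuilder
  def self.build_class(name, properties)
    klass = Class.new do
      # Remember which attributes were assigned, so that a save step
      # could update only those.
      define_method(:changed) { @changed ||= [] }

      properties.each do |prop|
        # Reader: returns nil until the attribute has been set.
        define_method(prop) { instance_variable_get("@#{prop}") }

        # Writer: records the attribute as changed, then stores the value.
        define_method("#{prop}=") do |value|
          changed << prop unless changed.include?(prop)
          instance_variable_set("@#{prop}", value)
        end
      end
    end
    # Register the class as a constant, e.g. OntologyBuilder::Patient.
    const_set(name, klass)
  end
end

patient_class = OntologyBuilder.build_class(:Patient, [:name, :birth_date])
patient = patient_class.new
patient.name = "Jane Doe"
puts patient.name
puts patient.changed.inspect
```

Class.new, define_method, and const_set are all plain core Ruby, which is a large part of why this sort of thing stays short.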
