
I’ve been using Sparta quite a bit recently, and have found it to be a really cool way of handling RDF in Python. For me it’s about 80% there.

The main problem is a fundamental mismatch between the way Python represents objects and the way RDF describes resources: in Python, an object attribute holds at most one value at a time. In RDF, a resource can have multiple values for the same property (corresponding to multiple subject, property, value statements).
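
To make the mismatch concrete, here is a minimal sketch that does not use Sparta at all. The triples, the `values` helper and the `Person` class are made up for illustration only.

```python
# Toy RDF data: each statement is a (subject, property, value) triple.
triples = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:alice", "foaf:mbox", "mailto:a.smith@example.org"),
}

def values(subject, prop):
    """Return every value asserted for (subject, prop), sorted for stable order."""
    return sorted(v for s, p, v in triples if s == subject and p == prop)

class Person(object):
    """Plain Python object: each attribute holds exactly one value."""
    pass

alice = Person()
alice.mbox = "mailto:alice@example.org"
alice.mbox = "mailto:a.smith@example.org"   # silently replaces the first value

rdf_mboxes = values("ex:alice", "foaf:mbox")   # both statements are still there
```

The RDF side keeps both `foaf:mbox` statements, while the Python attribute only remembers the last assignment.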

Sparta currently handles this mismatch by exposing each property through a generator interface, so you get each value by doing:

  for value in myresource.property:
      # .. do something with value

This is great for properties that do have multiple values, but I find that most properties in my data have only a single value, so I end up with a lot of:

  a = myresource.property.next()

lines littering my code. Also, when setting values on a resource I have to delete the original value before setting the new one (in order to overwrite it):

  del myresource.property
  myresource.property = "newvalue"

Sparta does support one feature that helps here: if there is a statement setting the ‘owl:maxCardinality’ of a property to ‘1’, it treats the property as a Pythonic attribute, allowing you to do:

  myresource.property = "newvalue"   # overwrites the old value
  print myresource.property          # prints "newvalue"

Unfortunately I can see 2 problems with this feature:

  • Using it requires you to effectively declare your properties before use (which goes against my Python coding habits)
  • Adding an owl:maxCardinality statement later can cause existing code to break, because the property stops behaving like a generator

Instead, I was wondering if the maxCardinality=1 behaviour could be the default. Something like the following:

  print resource.property              # get the value of 'property'. If there are
                                       # multiple values then arbitrarily pick one
                                       # (or maybe raise an exception)

  for value in resource.property:      # iterate over all values of 'property'
      print value

  resource.property = "foo"            # remove any existing 'property' values
                                       # and assert "foo"

  resource.add('property', value)      # add an additional property/value statement
                                       # for resource

Seems quite natural and doable to me - can anybody see any problems with this?
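
As a rough sketch of what this could look like (not Sparta's actual implementation), here is a toy `Resource` class backed by a plain set of triples. The class, its `values` method and the choice to raise `ValueError` on multiple values are all assumptions for illustration. Iteration goes through an explicit `values()` call here, since a single attribute cannot easily be both a plain value and an iterator over all values.

```python
class Resource(object):
    """Hypothetical wrapper with single-value attribute semantics by default."""

    def __init__(self, store, subject):
        # Bypass our own __setattr__ when storing internal state.
        object.__setattr__(self, "_store", store)
        object.__setattr__(self, "_subject", subject)

    def values(self, prop):
        """Return all values of prop, sorted for a deterministic order."""
        return sorted(v for s, p, v in self._store
                      if s == self._subject and p == prop)

    def add(self, prop, value):
        """Assert an additional (subject, prop, value) statement."""
        self._store.add((self._subject, prop, value))

    def __getattr__(self, prop):
        # Only called when normal attribute lookup fails.
        found = self.values(prop)
        if not found:
            raise AttributeError(prop)
        if len(found) > 1:
            # Assumption: raise rather than arbitrarily pick one value.
            raise ValueError("%r has %d values" % (prop, len(found)))
        return found[0]

    def __setattr__(self, prop, value):
        # Remove any existing values for prop, then assert the new one.
        for old in self.values(prop):
            self._store.discard((self._subject, prop, old))
        self.add(prop, value)


store = set()
resource = Resource(store, "ex:doc")
resource.title = "foo"
resource.title = "bar"             # overwrites "foo", no del needed
resource.add("creator", "alice")
resource.add("creator", "bob")     # a second value for the same property
```

With this, `resource.title` gives back a single value, `resource.values("creator")` gives both creators, and reading `resource.creator` raises because the property has more than one value.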

Viewing 1 Comment

    This would seem to be a good match for descriptors. Make the property a descriptor which checks maxCardinality: if it's 1, return the value; if it's more, return a generator. The __set__ on the descriptor could take care of setting the property.
    The only thing I can see which might cause problems is presenting a consistent API, i.e. how are you supposed to know whether a generator or a plain value/object gets returned?
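
The descriptor idea from the comment above could be sketched roughly like this. `RDFProperty`, `Document` and the `max_cardinality` argument are hypothetical names, the values live in a plain dict rather than an RDF graph, and having assignment add a value for multi-valued properties is an assumption.

```python
class RDFProperty(object):
    """Hypothetical descriptor: plain value if max_cardinality is 1, else a generator."""

    def __init__(self, name, max_cardinality=None):
        self.name = name
        self.max_cardinality = max_cardinality

    def __get__(self, instance, owner=None):
        if instance is None:
            return self   # accessed on the class itself
        found = sorted(instance._values.get(self.name, set()))
        if self.max_cardinality == 1:
            return found[0] if found else None
        return (v for v in found)   # generator over all values

    def __set__(self, instance, value):
        if self.max_cardinality == 1:
            # Single-valued: overwrite whatever was there.
            instance._values[self.name] = {value}
        else:
            # Multi-valued (assumption): assignment adds another value.
            instance._values.setdefault(self.name, set()).add(value)


class Document(object):
    title = RDFProperty("title", max_cardinality=1)
    creator = RDFProperty("creator")

    def __init__(self):
        self._values = {}


doc = Document()
doc.title = "old"
doc.title = "new"        # replaces "old"
doc.creator = "alice"
doc.creator = "bob"      # adds a second creator
creators = list(doc.creator)
```

It also shows the API concern raised in the comment: `doc.title` is a plain string while `doc.creator` is a generator, and nothing at the call site tells you which one to expect.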