Implementing graphs as triple ranges
Mar 24th, 2005 by Phil Dawes
Had an implementation thought on the train this morning:
Instead of implementing a graph as a single id in the 4th column of the quads table, how about implementing it as a range of triple ids. i.e. Each triple is given a sequential id in the 4th column when asserted, and the graphs are stored as (startid, endid).
Advantages:
- subgraphs can also be captured (useful for ‘quoting’)
- the order of the graph is preserved
(useful for retaining graph order for e.g. signing, and handling ordered collections without requiring an ordered collection type) - the number of statements in a graph is implicitly stored (endid - startid)
Disadvantages:
- Cant use a UNIQUE(s,p,o,g) index to assert that the same statement doesnt occur twice in a graph.
(Is this a problem? not really - can put up with duplicates) - Graphs can only be asserted one at a time.
(or could lease a block of ids or segregate the id space) - Store size limited by length of id field.
(actually I suspect other factors will limit scalability before this, and it could be made 64 bit) - Queries involving SOURCE become more tricky to implement internally.
(WHERE triple.graph < g.start AND triple.graph >g.end)> - Can’t add triples to an existing graph
(well, not without re-allocating ids or maybe using a cluster field in the graph table or something)
Hmmm… Subgraphs and preserved ordering are compelling advantages - this is sounding like a good tradeoff to me…

Add New Comment
Viewing 3 Comments
Thanks. Your comment is awaiting approval by a moderator.
Do you already have an account? Log in and claim this comment.
Do you already have an account? Log in and claim this comment.
Do you already have an account? Log in and claim this comment.
Do you already have an account? Log in and claim this comment.
Add New Comment