Posted in Semantic Web on Apr 18th, 2008 4 Comments »
When determining the meaning of tokens used in communication, there are two widely used approaches to disambiguation, which I’ll characterise as ‘namespacing’ and ‘context’.
When humans communicate amongst themselves they use the context of the communication to narrow down the range of possible meanings of terms used in the exchange, and human language doesn’t employ namespaces […]
Read Full Post »
Posted in Semantic Web on Apr 11th, 2008 5 Comments »
I like listening to the Talis podcasts because they motivate me to think about semantic-web issues. Unfortunately I usually spend the entire session muttering to myself because I disagree with so much of what is said.
The issue for me is that speakers often paint a rosy view of a merged data world where ‘if only’ people […]
Read Full Post »
Posted in Semantic Web, web on Dec 15th, 2007 1 Comment »
Thanks to all who commented on my previous post; it’s made me rethink and clarify my position on the problems with scaling Semantic Web technologies. I boiled it down to this:
Semantic Web clients beware: URIs are syntactically universal, not semantically universal.
The rationale for this is that although it is practically impossible for two disconnected parties […]
Read Full Post »
Posted in Semantic Web, web on Dec 12th, 2007 6 Comments »
I only just read Jim Hendler’s piece from last month “shirkying my responsibility”, in which he states that the W3C Semantic Web vision was never about a global shared ontology at all:
“Get it - we are opposing the idea of everyone sharing common concepts.”
This seems odd to me, because if that is the case and […]
Read Full Post »
I’m trying to work out if it’s possible to get the index searching performance I want using a disk-backed store, or whether I need to focus on optimising the indexes to fit totally in memory. The problem is that the optimisation strategies are somewhat different:
Storing indexes on disk:
- Increase redundancy, trading space for better locality […]
Read Full Post »
The new triplestore is coming along. It can do substring text searches (using a suffix array) and has a basic relational query engine. It doesn’t optimise the query plans yet, but if you enter the queries in a good order (most selective clauses first) then you get good performance.
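As a rough illustration of how a suffix array supports substring search (a minimal Python sketch, not the triplestore's actual code; the function names `build_suffix_array` and `find_all` are made up for this example): sort every suffix start offset by the suffix it points to, then binary-search for the block of suffixes that begin with the pattern.

```python
def build_suffix_array(text):
    """Return the start offsets of all suffixes of text, in sorted suffix order.

    Naive construction (sorts full suffix strings); fine for a demo,
    far too slow for a real index.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, suffix_array, pattern):
    """Return the sorted start offsets of every occurrence of pattern in text."""
    if not pattern:
        return []
    m = len(pattern)

    # First suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Everything in between starts with pattern.
    return sorted(suffix_array[start:lo])


text = "banana"
sa = build_suffix_array(text)
matches = find_all(text, sa, "ana")
```

Each lookup costs two binary searches over the sorted suffixes, so the work per query grows with the logarithm of the text length rather than with the text length itself.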
A few things have changed in my […]
Read Full Post »
Posted in Semantic Web, web on May 17th, 2007 No Comments »
Patrick Logan seems convinced that having a proprietary runtime doesn’t matter because Apollo apps can still interact with data on the web.
This doesn’t sound like a great deal to me. For example, if I’m unable to run an app because the vendor doesn’t support my OS or device, then being able to fetch blobs […]
Read Full Post »
This link showed up on programming-reddit today. I’ve been reading Tim Bray’s ongoing blog for quite a while, but this piece from 2003 predates my discovery of it. And that’s a big shame because it is quite simply the most brilliantly readable overview of the whole topic of ‘large scale search’ that I’ve ever come […]
Read Full Post »
I wrote a bit about representing structured data in the last post. Here are some ideas for how I plan to index the data.
Indexing graphs as subject ranges
In indexing triples I need to provide indexed lookups for all 6 of the possible triple query patterns:
s->po
sp->o
p->os
po->s
o->sp
os->p
(s=subject p=property/predicate o=object)
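One common way to cover all six patterns is to keep three orderings of each triple (spo, pos, osp), since every pattern above is a one- or two-term prefix of one of them. The following is a minimal in-memory Python sketch of that idea, under the assumption that nested dictionaries are good enough for illustration; the class `TripleIndex` and its methods are hypothetical names, not the design described in the post.

```python
from collections import defaultdict

# Each ordering serves two query patterns:
#   spo: s->po, sp->o    pos: p->os, po->s    osp: o->sp, os->p
ORDERS = ("spo", "pos", "osp")


class TripleIndex:
    """Toy triple index: one nested dict per ordering of (s, p, o)."""

    def __init__(self):
        # _index[order][first][second] -> set of third components
        self._index = {
            order: defaultdict(lambda: defaultdict(set)) for order in ORDERS
        }

    def add(self, s, p, o):
        """Store the triple under all three orderings."""
        triple = {"s": s, "p": p, "o": o}
        for order in ORDERS:
            first, second, third = (triple[k] for k in order)
            self._index[order][first][second].add(third)

    def lookup(self, order, first, second=None):
        """Query by a one- or two-term prefix of the given ordering.

        One bound term returns a set of (second, third) pairs;
        two bound terms return a set of third components.
        """
        level = self._index[order].get(first, {})
        if second is None:
            return {(b, c) for b, cs in level.items() for c in cs}
        return set(level.get(second, set()))


idx = TripleIndex()
idx.add("alice", "knows", "bob")
idx.add("alice", "knows", "carol")
idx.add("bob", "knows", "carol")

about_alice = idx.lookup("spo", "alice")          # s->po
who_knows_carol = idx.lookup("pos", "knows", "carol")  # po->s
```

The trade-off is the usual one: every triple is stored three times in exchange for a direct lookup on any bound term or pair of terms.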
Most mature triplestores also index a 4th query element […]
Read Full Post »
Now that I’m up and running and starting to get productive on Gambit-C, I’ve turned my attention back to indexing structured data.
I’ve modified the tagtriples idea a bit to reflect my experience on importing data in other formats. I still think the most effective approach is not to try and define an interchange format, but […]
Read Full Post »
I tried to comment on Seth’s post, but I think the comments on his blog are a bit broken at the moment (the captcha question wasn’t rendering, so I couldn’t answer it!). I guess I’ll trackback instead:
The path from specificity to usefulness that Seth describes was exactly the trip I took attempting to implement semantic […]
Read Full Post »
I haven’t said anything much about semantic web stuff for a while as I’ve been occupied with other things. However Jim Hendler’s ‘Tales from the Dark Side’ piece in IEEE Intelligent Systems reawoke an old interest. In short: I still think the RDF people have got it wrong with URIs, and so far nobody’s convinced […]
Read Full Post »
This is old news but I’ve only just noticed that Numenta have released their whitepaper about HTM (called Hierarchical Temporal Memory - Concepts, Theory, and Terminology). Numenta is the company that Jeff Hawkins formed with Dileep and others to create products around the ideas in his ‘On Intelligence’ book.
In short: Hawkins believes he’s got a […]
Read Full Post »
Pretty much all the ‘work’ applications I’ve built in the last couple of years can be split into 2 parts:
1) CRUD: The database, and a UI for manipulating the data in it
2) LOGIC: The actual application functionality. (i.e. the reason for the app to exist)
These days I’m finding that the amount of actual logic/functionality I […]
Read Full Post »
Posted in Semantic Web on Jun 18th, 2006 3 Comments »
I’ve been thinking a bit about scaling triplestores recently. My MySQL-based tagtriples store is working well at work but is ultimately limited by the amount of memory you can cram into a single machine. I’ve recently become seduced by the idea of putting all the bank’s data (world’s data?) into one massive triplestore, and […]
Read Full Post »
I just did a little bit of testing at work and thought I’d dump it here for future googlers:
We’ve got an internal app team building some REST webservices and they wanted to know the practical limits of GET URL length. Now browser limits are well known, but since the clients will be programmatic they wanted […]
Read Full Post »
Posted in General, Semantic Web on Mar 9th, 2006 1 Comment »
Seth posted a response to my ‘global identifiers don’t scale’ post that I didn’t expect. His point is that it is the meaning that isn’t consistent across the semantic web, not the identifiers. I agree with him about meaning being inconsistent, but it’s the distinction that confuses me - in my posts I’ve conflated identity […]
Read Full Post »
Posted in Semantic Web on Mar 9th, 2006 3 Comments »
Thanks to everybody that commented on my ‘global identifiers don’t scale’ post (and especially for Seth and John’s responses). It’s obvious that I didn’t make myself very clear - sorry about that.
I think I made two main points:
The first was the scalability point (one I failed to make very well at all):
By ‘global identifiers schemes […]
Read Full Post »
When designing large information systems to hold data from a wide range of sources (e.g. a large company inventory or knowledge base), a common approach is to employ a global identifier scheme so that entities can be referenced unambiguously across the system. A really large scale example of this approach is the W3C Semantic Web […]
Read Full Post »
Here’s an attempt at a working definition:
The ‘meaning’ of an identifier = the complete set of assertions that can be made about it.
(for the purpose of discussion of identity in information systems).
E.g. if an assertion is true for one identifier but not another, then they don’t mean (or denote) the same thing.
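To make the definition concrete, here is a small Python sketch. It assumes assertions are held as (identifier, predicate, value) tuples and that only a finite, known set of them is available, which is weaker than the ‘complete set’ in the definition; the function names are invented for this example.

```python
def meaning(identifier, assertions):
    """The set of (predicate, value) assertions made about identifier."""
    return frozenset(
        (predicate, value)
        for (subject, predicate, value) in assertions
        if subject == identifier
    )


def same_meaning(a, b, assertions):
    """Two identifiers denote the same thing only if no assertion
    holds for one of them but not the other."""
    return meaning(a, assertions) == meaning(b, assertions)


known = [
    ("id:1", "colour", "red"),
    ("id:2", "colour", "red"),
    ("id:3", "colour", "blue"),
]

same_1_2 = same_meaning("id:1", "id:2", known)  # identical assertion sets
same_1_3 = same_meaning("id:1", "id:3", known)  # differ on colour
```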
Does that sound […]
Read Full Post »