Archive for the 'Semantic Web' Category

When designing large information systems to hold data from a wide range of sources (e.g. a large company inventory or knowledge base), a common approach is to employ a global identifier scheme so that entities can be referenced unambiguously across the system. A really large scale example of this approach is the W3C Semantic Web […]

Read Full Post »

Here’s an attempt at a working definition:
The ‘meaning’ of an identifier = the complete set of assertions that can be made about it.
(for the purpose of discussion of identity in information systems).
E.g. if an assertion is true for one identifier but not another, then they don’t mean (or denote) the same thing.
Does that sound […]
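A minimal sketch of that working definition, assuming assertions are held as (subject, property, value) triples and that the "complete set" is approximated by whatever triples the system currently knows. The names `meaning` and `same_meaning` and the sample data are hypothetical, made up for illustration only.

```python
from typing import FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, property, value); an assumed representation


def meaning(identifier: str, assertions: Iterable[Triple]) -> FrozenSet[Tuple[str, str]]:
    """Return every (property, value) pair asserted about the identifier."""
    return frozenset((p, v) for (s, p, v) in assertions if s == identifier)


def same_meaning(a: str, b: str, assertions: Iterable[Triple]) -> bool:
    """Two identifiers denote the same thing only if every assertion holds for both."""
    known = list(assertions)
    return meaning(a, known) == meaning(b, known)


# Hypothetical sample data: one assertion differs between the first two identifiers.
facts = [
    ("morning_star", "type", "planet"),
    ("evening_star", "type", "planet"),
    ("morning_star", "visible_at", "dawn"),
    ("evening_star", "visible_at", "dusk"),
    ("venus", "type", "planet"),
    ("venus", "visible_at", "dawn"),
]

differs = same_meaning("morning_star", "evening_star", facts)
matches = same_meaning("morning_star", "venus", facts)
```

Under this sketch `differs` is false because the `visible_at` assertion is true for one identifier but not the other, while `matches` is true because both identifiers carry exactly the same assertions.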

Read Full Post »

I noticed that the podcast mp3s from the Future of Web Apps conference a couple of weeks ago are up on the site.
I really enjoyed, and would especially recommend, Tom Coates’ ‘Native to a web of data’ presentation to those who are interested in web 2.0 and semantic webby stuff. His slides (to go with […]

Read Full Post »

I was having a discussion with somebody at a conference recently about merging data on a large scale, and how the differing ‘contexts’ under which the data is created make global merging difficult. He asked me to define what I meant by the ‘context’ of the data. It’s a woolly term I’d been using for […]

Read Full Post »

My recent look at microformats has led me to think more about the shades of grey between being able to fully interpret (understand) data, and not being able to interpret it at all. Microformats are currently very binary in this regard - either the software knows the microformat and is able to interpret it, or […]

Read Full Post »

Have spent some spare time looking at microformats recently (and more importantly, writing a microformats parser).
The main thing that troubles me is that microformats have no explicit way of conveying the structure of the data. This scuppers the idea of a general microformats importer (which I would obviously like for JAM*VAT, amongst other things).
There are […]

Read Full Post »

I couldn’t find a way to comment on Benjamin’s post, so I’ve stuck my comment here:
What indexing are you using? My tagtriples store schema is basically a table with 4 ids which joins to an (ID,String) table. When it used to be an RDF store this held both literals and URIs.
I found the key to getting […]
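A minimal sketch of the kind of layout described above, assuming SQLite in place of the real backend. The table names, column names, and helper functions are hypothetical; in particular, the roles given to the four ids (graph, subject, property, value) are an assumption, not the actual tagtriples schema.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create an in-memory store: one table of 4 ids joined to an (ID, String) table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE symbol (
            id    INTEGER PRIMARY KEY,
            value TEXT UNIQUE NOT NULL
        );
        CREATE TABLE statement (
            graph_id    INTEGER NOT NULL REFERENCES symbol(id),
            subject_id  INTEGER NOT NULL REFERENCES symbol(id),
            property_id INTEGER NOT NULL REFERENCES symbol(id),
            value_id    INTEGER NOT NULL REFERENCES symbol(id)
        );
        """
    )
    return conn


def intern_string(conn: sqlite3.Connection, text: str) -> int:
    """Return the id for a string, inserting it into the symbol table if it is new."""
    conn.execute("INSERT OR IGNORE INTO symbol (value) VALUES (?)", (text,))
    row = conn.execute("SELECT id FROM symbol WHERE value = ?", (text,)).fetchone()
    return row[0]


def add_statement(conn, graph: str, subject: str, prop: str, value: str) -> None:
    """Store one statement as four ids pointing into the symbol table."""
    ids = tuple(intern_string(conn, t) for t in (graph, subject, prop, value))
    conn.execute("INSERT INTO statement VALUES (?, ?, ?, ?)", ids)


def read_statements(conn):
    """Join each of the four ids back to its string."""
    return conn.execute(
        """
        SELECT g.value, s.value, p.value, v.value
        FROM statement t
        JOIN symbol g ON g.id = t.graph_id
        JOIN symbol s ON s.id = t.subject_id
        JOIN symbol p ON p.id = t.property_id
        JOIN symbol v ON v.id = t.value_id
        ORDER BY t.rowid
        """
    ).fetchall()
```

Each distinct string is stored once, so the statement table stays narrow and any indexes on it cover integers rather than text.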

Read Full Post »

Two ways to enable the disambiguation of data:

ensure that each identifier is namespaced so that it can’t collide with any other

or
allow identifier names to collide, but also capture context to enable the consumer to disambiguate

RDF relies on the former approach, English the latter.
N.B. An advantage of the latter approach is that it allows the […]

Read Full Post »

I’ve recently been experimenting with ways to provide simpler structured searching/querying to ‘normal’ web users (i.e. not techies). SPARQL/SQL querying doesn’t cut it here - we need something simpler.
One approach I’ve been trying is allowing simple query constraints in with the text search facility. Using the proximity searching capability JAM*VAT then finds a collection of […]

Read Full Post »

Using a relational database as a triplestore backend has a number of advantages - one of which is leveraging features of the backend SQL support with very little effort.
I’ve recently added a whole bunch of functionality to ttql (the experimental query language that JAM*VAT uses for querying). The additions include:
SQL (MySQL) numeric and string functions […]

Read Full Post »

Danny! Here’s another

Read Full Post »

It’s just occurred to me that I never posted about the proximity search capability that I built into JAM*VAT about 3 months ago.
It works by looking for symbols in close proximity. E.g. searching for ‘Danny Ayers Blog‘ yields an answer ‘raw’, even though the word isn’t in the search string. This is because the ‘raw’ […]

Read Full Post »

JAM*VAT is now mature enough that it handles relational operations over large amounts of aggregated structured data quickly and scalably, and also provides very fast regex text search operations (due to its inbuilt suffix array implementation).
However, one area where it doesn’t perform very well is in handling dates and numbers. E.g. if you aggregated 10000 […]

Read Full Post »

I’m quite excited about this release - it includes new POST functionality that accepts HTTP-POSTed content and interprets it according to its MIME type. The upshot is that people can cut-n-paste XML chunks into JAM*VAT (which is a compelling way to demonstrate the technology).
You can try it via the online demo - click on the ‘Post Data’ link […]

Read Full Post »

Here’s a barrier to successfully using RDF URIs for identifying things collaboratively: you need to know the URI before you can use it.
If two parties create URIs for the same thing separately, the chances of them minting the same URI are pretty much nil. This is especially true with temporal separation - you […]

Read Full Post »

With all the buzz around the possibility of an ‘RDF-Lite’, I feel compelled to list a few barriers that I think URIs raise for a new user trying to get to grips with RDF metadata creation.
Here they are, in no particular order:
(1) URIs don’t allow you to use existing identity schemes.
Apart from existing web […]

Read Full Post »

In the last post I mentioned importing XML into the JAM*VAT tagtriples store. One of JAM*VAT’s main features is that it can translate *any* XML into a tagtriples representation using some simple heuristics. I thought I’d better elaborate on this, especially as I haven’t documented the heuristics anywhere.
Before starting, I ought to point out that […]

Read Full Post »

Ian Davies has been discussing the complexity of RDF and considering the possibility of an RDF Lite. Danny Ayers also picked it up here.
Readers of this blog will already know that I struggled with teaching RDF’s complexity when attempting to promote and deploy it at work. This prompted my research and creation of a simpler […]

Read Full Post »

I’ve just put up v0.7.5 of JAM*VAT. This improves the RDF URI-to-symbol heuristics and adds some stuff that we needed at work - mainly arithmetic comparisons in structured queries. I have also put the new release on the demo site.

Read Full Post »

I hardly got any response to the launch of my JAM*VAT structured aggregator tool. Either that means that nobody’s got a use for it, or they just don’t understand what it does. I’m hoping it’s the latter, so I thought I’d post some things to try with the demo installation. Here’s a first stab:
Getting a […]

Read Full Post »
