With all the buzz around the possibility of an ‘RDF-Lite’, I feel compelled to list a few barriers that I think URIs raise for a new user trying to get to grips with RDF metadata creation.

Here they are, in no particular order:

(1) URIs don’t allow you to use existing identity schemes.

Apart from existing web resources, URIs aren’t currently used in everyday life, which means that a new URI must be created, described, maintained and promoted for each resource you want to describe in RDF.

(2) HTTP URIs have a load of implicit baggage

Hash vs slash, how to distinguish web resources from physical ones, how to keep URIs persistent: all of these must be understood, and decisions with important ramifications down the line must be made up front.

(3) URIs are URLs

This confuses everybody I try to describe RDF to. Regardless of philosophical debate, the most commonly accepted use of a URL is to denote the address of a web resource. Using them to denote real-world things confuses people and (as a side effect) creates messy debates about how to denote existing web resources, how to describe resources on the web etc…

(4) URIs require a level of precision in ‘meaning’ that is hard to attain.

URIs are globally scoped, which means they need to mean the same thing in any context. Whether this is actually possible on a large scale is a subject for debate, but regardless - using URIs collaboratively and successfully requires a non-trivial amount of upfront thought, documentation and proactive consensus building.

So there you go. Any others?

Viewing 11 Comments

    "Any others?"

    Yes:

    http://laurentszyster.be/blog/public-names/

    of course ;-)

    Public Names provide a data model that:

    1. Captures simple text articulation as unique
    sets of strings in a single semantic field,
    for instance (with CRLF added):

    17:
    6:Public,
    5:Names,
    ,
    15:
    4:data,
    5:model,
    ,
    1:a,
    7:provide,
    4:that,

    2. Allows a simple computer system to validate
    a string of bytes as an *unambiguous* text
    articulation, for instance:

    5:Dawes,4:Phil,

    and use it as a Unique Resource Identifier
    with the required properties for a semantic
    application.

    Kind Regards,
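
For readers who haven't met the notation, the examples above look like netstrings (`<length>:<bytes>,`), nested one level for groups of words. Here is a minimal Python sketch of that encoding and of the validation step. The function names `encode` and `decode_all` are made up for illustration, and the sketch deliberately ignores whatever uniqueness and ordering rules Public Names adds on top.

```python
def encode(item) -> bytes:
    """Encode a str as a netstring, or a list as a netstring of netstrings."""
    if isinstance(item, str):
        payload = item.encode("utf-8")
    else:
        payload = b"".join(encode(child) for child in item)
    # The length prefix counts the bytes of the payload only.
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_all(data: bytes) -> list:
    """Split data into consecutive netstring payloads.

    Raises ValueError if the bytes are not a well-formed sequence of
    netstrings, which is the 'validate a string of bytes' step.
    """
    items = []
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)  # ValueError if no colon is left
        length_field = data[pos:colon]
        if not length_field.isdigit():
            raise ValueError("length prefix must be decimal digits")
        start = colon + 1
        end = start + int(length_field)
        if data[end:end + 1] != b",":
            raise ValueError("payload must be followed by a comma")
        items.append(data[start:end])
        pos = end + 1
    return items


if __name__ == "__main__":
    print(encode(["Public", "Names"]))
    print(decode_all(b"5:Dawes,4:Phil,"))
```

The length prefix is what makes validation cheap: a parser never has to guess where a word ends.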
    I've come to a few different conclusions regarding the above assertions:

    (1) Why can't you use blank nodes if you can't use URI References? Resources don't need to be named, and sometimes (like in a database-like environment) most resources will be unnamed.

    If you are willing to step up to OWL, then with inverse-functional properties you can still identify things with a "public key" like structure. However, you can do that anyway with any practical RDF application too.

    Also, minting URI References is easy. Here's a URIRef: "data:Jimmy_Cerra". It is a little different from English, but we are working with computer languages, not English. Would those people complain about having to write their words in a language like Japanese? Are their expectations reasonable?

    Also, the only requirement is that a URI Reference is semantically uniform across the graphs you use it in. Problems happen when you merge graphs that have different semantics for the same URI Reference, but sometimes the graphs being merged are small and manageable.

    If you have to merge with large numbers of graphs or with the whole Semantic Web for all of eternity, then I can see where minting URI References is a problem. But that is a social problem with naming itself and not RDF.

    (2) Yes, and no. I've come to the conclusion that the only way to understand the semantics of anything is to ask the author (i.e. human documentation). There is no way to do so via computers. This has been the case since the dawn of internet time (from the RFC specs to XHTML to the Atom Publication Format).

    That's one reason nobody likes DTDs, RELAX NG, XML Schema, OWL (sometimes), and others to specify semantics. You can't do so completely for most non-trivial applications, and all those validation technologies are only hints. That's also why everyone loves XML Schema Datatypes: those elements specify semantics rather than provide a framework for specifying semantics.

    (3) Just because some people get confused doesn't mean that everyone does. I understand the differences, as do the people I explain them to. Should we throw away calculus because some people don't understand it?

    (4) See (1).

    I used to be really bugged by those problems... but I think I've found enlightenment. The best way to write semantic web software is to assume, like Socrates and Descartes, that "To know that you do not know is true wisdom". I.e. assume the semantics of nothing in any context, and look it up or ask the URI owner.
    Strictly speaking, (4):

    "URIs are globally scoped, which means they need to mean the same thing in any context."

    isn't true, for RDF. URIs don't have meaning; they have denotations. Denotations are assigned ("distributed"), and that can be done in a local scope. In theory, when you merge data, you determine that the same URI has different referents via logical inconsistencies; in practice you have domain experts and data modellers analyse the data (just like you do with relational database integrations).

    For me, you left out a most important thing, which is that lots of URIs in the same place are hard to read. QNames win the readability argument.
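
To make the readability point concrete, here is a rough Python sketch of abbreviating full URIs to QName-style `prefix:local` strings. The prefix table and the helper name `to_qname` are assumptions for illustration; real toolkits also check that the local part is a legal XML name, which this skips.

```python
# Hypothetical prefix table; the two namespace URIs are the usual FOAF and RDF ones.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def to_qname(uri: str, prefixes: dict = PREFIXES) -> str:
    """Return 'prefix:local' using the longest matching namespace.

    URIs that match no known namespace are returned unchanged.
    """
    best = None
    for prefix, namespace in prefixes.items():
        # Require a non-empty local part after the namespace.
        if uri.startswith(namespace) and len(uri) > len(namespace):
            if best is None or len(namespace) > len(best[1]):
                best = (prefix, namespace)
    if best is None:
        return uri
    prefix, namespace = best
    return f"{prefix}:{uri[len(namespace):]}"


if __name__ == "__main__":
    for uri in (
        "http://xmlns.com/foaf/0.1/mbox",
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
        "http://example.org/unknown",
    ):
        print(to_qname(uri))
```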
    Hi Jimmy,

    Jimmy Cerra writes:

    Why can't you use blank nodes if you can't use URI References? Resources don't need to be named, and sometimes (like in a database-like environment) most resources will be unnamed.

    If you are willing to step up to OWL, then with inverse-functional properties you can still identify things with a “public key” like structure. However, you can do that anyway with any practical RDF application too.


    Actually I attempted to follow this approach at work for a while (à la FOAF), and was indeed willing to step up to OWL - my veudas triplestore supported inverse-functional properties for this reason (via a forward-chaining reasoner; see circa Sep 2004 if you're interested!).
    It did make things complicated though - IFP smushing was slow, and unless you're going to give people cookie-cutter examples then they really do need to understand IFPs.

    e.g. people don't naturally write:

    <pre>
    <project>
      <name>My Application</name>
      <maintainer>
        <foaf:Person>
          <foaf:mbox>foo@example.com</foaf:mbox>
        </foaf:Person>
      </maintainer>
    </project>
    </pre>

    Unfortunately cookie-cutter examples kind-of miss the point - you might as well be translating people's data into RDF for them. The real goal for me at work was that people could come up with their own data (from their own systems) that could be aggregated and merged usefully, otherwise it's not really worth the trouble.
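
For anyone unfamiliar with the "IFP smushing" mentioned above, this is roughly the idea, sketched in Python with plain tuples rather than any real triplestore API. Nodes that share a value for an inverse-functional property (here `foaf:mbox`) are treated as the same resource. The function name `smush`, the blank-node labels and the data are invented for illustration, and this single pass only handles IFP values that are literals. It is not how veudas did it.

```python
def smush(triples, ifps):
    """Merge subjects that share a value for an inverse-functional property.

    triples: iterable of (subject, predicate, object) tuples of strings.
    ifps:    set of predicates treated as inverse-functional.
    Returns a set of triples with merged nodes renamed to one representative.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller label so results are deterministic.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    triples = list(triples)
    first_subject = {}
    for s, p, o in triples:
        if p in ifps:
            key = (p, o)
            if key in first_subject:
                union(first_subject[key], s)
            else:
                first_subject[key] = s

    return {(find(s), p, find(o)) for s, p, o in triples}


if __name__ == "__main__":
    data = [
        ("_:a", "foaf:mbox", "mailto:foo@example.com"),
        ("_:a", "foaf:name", "Foo"),
        ("_:b", "foaf:mbox", "mailto:foo@example.com"),
        ("_:b", "maintains", "_:proj"),
    ]
    for triple in sorted(smush(data, {"foaf:mbox"})):
        print(triple)
```

Even this toy version hints at the cost: every merge needs an index over the IFP values, and a real reasoner has to repeat the pass until nothing more merges.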


    ...
    Also, the only requirement is that a URI Reference is semantically uniform across the graphs you use it in. Problems happen when you merge graphs that have different semantics for the same URI Reference, but sometimes the graphs being merged are small and manageable.

    If you have to merge with large numbers of graphs or with the whole Semantic Web for all of eternity, then I can see where minting URI References is a problem. But that is a social problem with naming itself and not RDF.


    I think it's a problem with globally scoped naming: the RDF model doesn't allow for any skewing of meaning with context. You can't change society, and global adoption is one of the aims of the semantic web.

    To be honest I think this sort-of illustrates a wider point - if you're just going to work on small manageable sets of data then why bother with complex URI and RDF machinery that inhibits adoption? It strikes me as quite ironic that the very RDF machinery that was intended to facilitate this large-scale aggregation of data actually ends up inhibiting it.
    Bill de hOra writes:
    Strictly speaking, (4):

    “URIs are globally scoped, which means they need to mean the same thing in any context.”

    isn’t true, for RDF. URIs don’t have meaning; they have denotations. Denotations are assigned (“distributed”), and that can be done in a local scope. In theory, when you merge data, you determine that the same URI has different referents via logical inconsistencies; in practice you have domain experts and data modellers analyse the data (just like you do with relational database integrations).


    Ok - that makes sense (although I haven't read that anywhere before - but then I'm starting to fall behind with the literature ;-) ).

    Which means that there's probably a lot of scope for simplifying RDF - you can't throw a baby out with the bathwater if it wasn't in the bath to begin with.
    (3) and partly (2): Don't point a gun at a person unless you mean to kill them. Don't point an HTTP URL at a resource unless you mean to retrieve it (or otherwise access it using the Hypertext Transfer Protocol). For "real-world" things use tag: or similar URIs.

    This will make the distinction clearer to people and will also avoid wasted network traffic when attempts are made to retrieve the resource.

    I realise I'm in a minority with respect to this opinion on the use of HTTP URLs but I've yet to see a coherent argument against it.
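
As a rough illustration of the tag: suggestion, here is a small Python sketch that mints identifiers in the `tag:authority,date:specific` shape defined by RFC 4151. The helper name `mint_tag_uri` and the example values are made up, and the validation is looser than the RFC's grammar.

```python
import re
from urllib.parse import quote

# RFC 4151 dates are YYYY, YYYY-MM or YYYY-MM-DD.
_TAG_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def mint_tag_uri(authority: str, date: str, specific: str) -> str:
    """Build a tag: URI from a domain name or email address, a date on which
    that authority controlled the name, and a locally chosen identifier."""
    if not authority or "," in authority or ":" in authority:
        raise ValueError("authority must be a domain name or email address")
    if not _TAG_DATE.match(date):
        raise ValueError("date must be YYYY, YYYY-MM or YYYY-MM-DD")
    # Percent-encode anything unusual in the local part; '/' is left as-is.
    return f"tag:{authority},{date}:{quote(specific, safe='/')}"


if __name__ == "__main__":
    print(mint_tag_uri("example.org", "2005", "people/phil-dawes"))
```

Nothing here touches the network, which is the point: a tag: URI names something without implying it can be fetched.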
    In theory, when you merge data, you determine that the same URI has different referents via logical inconsistencies; in practice you have domain experts and data modellers analyse the data (just like you do with relational database integrations).

    Surely, if two or more datasets use the same URI to denote different resources then at least one of them is simply wrong - it is not using the URI in the way that the URI's original minter intended. In practice, you need to have your domain experts fix up the data before the merge.
    Each of the problems you point out is really by design:

    > (1) URIs don’t allow you to use existing identity schemes.
    Exactly because to do so would be ambiguous. How do you know what identity scheme is being used? You could, say, prefix it with the name of the scheme (i.e. myscheme:12345) -- but then you have to unambiguously identify the scheme name. If the scheme name is unambiguous, then you have a URI anyway.

    > (2) HTTP URIs have a load of implicit baggage
    It's not a requirement that people use HTTP URIs. I'd be all for throwing away these, but that doesn't mean throwing away the entire URI concept.

    > (3) URIs are URLs
    Aren't URLs URIs? Same as 2.

    > (4) URIs require a level of precision in ‘meaning’ that is hard to attain. URIs are globally scoped, which means they need to mean the same thing in any context.
    If this weren't the case, no two RDF documents could ever be merged because you would never know if the authors intended their nodes to denote the same thing. But, like it was pointed out, it's not necessarily a problem if this doesn't occur in practice.

    > using URIs collaboratively and successfully requires a non-trivial amount of upfront thought, documentation and proactive consensus building.
    Every naming scheme is going to be like that, to some degree. Do URIs actually require more upfront thought than other schemes, though?
    Joshua Tauberer writes:

    > (1) URIs don’t allow you to use existing identity schemes.

    Exactly because to do so would be ambiguous. How do you know what identity scheme is being used?


    Context tells you this.



    > (4) URIs require a level of precision in ‘meaning’ that is hard to attain. URIs are globally scoped, which means they need to mean the same thing in any context.

    If this weren’t the case, no two RDF documents could ever be merged because you would never know if the authors intended their nodes to denote the same thing. But, like it was pointed out, it’s not necessarily a problem if this doesn’t occur in practice.



    I think when it doesn't happen in practice it's because the people doing the merging know something of the context in which the document was written. You need this anyway - otherwise how do you know that the author of the RDF graph is a reliable source, or even competent in RDF?

    Besides - I think this problem does happen in practice.
    Joshua Tauberer writes:

    > using URIs collaboratively and successfully requires a non-trivial amount of upfront thought, documentation and proactive consensus building.

    Every naming scheme is going to be like that, to some degree. Do URIs actually require more upfront thought than other schemes, though?


    More than localized, context bound schemes - yes. E.g.

    PhilDawes name "Phil Dawes"
    PhilDawes email pdawes@users.sf.net

    didn't require much thought, because it is bound to the scope of this blog comment. It's a bit throwaway, but you still understand what I mean to some degree because you understand something of the context under which I wrote it.
    Do URIs actually require more upfront thought than other schemes, though?
    More than localized, context bound schemes - yes.

    But the really cute thing about URIs is that they form a sort of federation of separate localized context bound schemes. Each URIRef carries around inside itself both the global name of the local scheme and the name within that scheme - all in a reasonably familiar, readable and compact sequence of characters. So no, I don't think URIRefs can require more upfront thought than localized schemes other than the trivial issue of deciding on the first part of the URI used to prefix names in the scheme - to make them globally unique.
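
The "global name of the local scheme plus the name within that scheme" reading can be sketched in a few lines of Python. The split rule (last `#`, else last `/`, else last `:`) is a common convention rather than anything RDF mandates, and `split_uriref` is a made-up helper name.

```python
def split_uriref(uri: str) -> tuple:
    """Split a URIRef into (scheme-wide prefix, local name).

    Tries the last '#', then the last '/', then the last ':' and uses the
    first separator that leaves a non-empty local name.
    """
    for separator in ("#", "/", ":"):
        head, found, tail = uri.rpartition(separator)
        if found and tail:
            return head + separator, tail
    return uri, ""


if __name__ == "__main__":
    for uri in (
        "http://example.org/people#PhilDawes",
        "http://example.org/people/PhilDawes",
        "data:Jimmy_Cerra",
    ):
        print(split_uriref(uri))
```

With the prefix fixed once, the local names on the right are about as cheap to invent as the throwaway `PhilDawes` in the earlier comment.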