URIs considered harmful?

How about that? Remember the discussions from months ago (“WS-Web” and “Names and Addresses – a different view”) about the identity of resources and their relationship to HTTP URIs? I just read this story about Netscape taking down the resource representation of the RSS 0.91 DTD Schema at the end of the http://my.netscape.com/publish/formats/rss-0.9.dtd URI. They brought it back, but with an expiration date in July. I may have missed it, but does RFC 2616 say anything about URI expiration?

Yes, agents around the world should cache the resource’s representation; they shouldn’t retrieve it every time it is used; they should be implemented defensively against the brittle, loosely coupled nature of the Web. Still, the resource’s representation is retrieved 4M times per day. Now, if the resource were associated with a protocol-independent URI that could be retrieved/searched/indexed irrespective of the protocol/technology used to access it, we wouldn’t have this problem.
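To make the “cache it and be defensive” point concrete, here is a minimal sketch in Python of what such an agent might do. Everything in it is an assumption for illustration: the function name get_representation, the in-memory dictionary used as a cache, the injected fetch callable, and the one-year default lifetime are all hypothetical, not something taken from any particular RSS library.

```python
import time

# Hypothetical default lifetime for a cached copy, in seconds.
ONE_YEAR = 365 * 24 * 60 * 60


def get_representation(url, cache, fetch, max_age=ONE_YEAR, now=None):
    """Return the representation for url, preferring a local copy.

    cache is a plain dict mapping url -> (time_fetched, body).
    fetch is any callable that takes a URL and returns the body,
    raising OSError (or a subclass) when retrieval fails.
    """
    now = time.time() if now is None else now
    entry = cache.get(url)

    # A fresh local copy means no network traffic at all.
    if entry is not None and now - entry[0] < max_age:
        return entry[1]

    try:
        body = fetch(url)
    except OSError:
        # Defensive fallback: a stale copy beats a hard failure
        # if the resource has been taken down.
        if entry is not None:
            return entry[1]
        raise

    cache[url] = (now, body)
    return body
```

In real use, fetch could be something like lambda u: urllib.request.urlopen(u).read(), since the errors urllib raises are subclasses of OSError. A real agent would also want to persist the cache to disk and honour the caching headers the server sends, both of which this sketch leaves out.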

I think the coupling of identity with protocol semantics makes it more difficult to program defensively.

Please, be gentle in your comments 🙂

3 responses to “URIs considered harmful?”

  1. Mark Mc Keown

    Hi Savas,

    They should set the expiration date to a year, which means you can cache it forever – HTTP only allows a maximum cache time of 1 year, which is a bug. People use 1 year to mean infinity. HTTP sucks 😉

    As for the wider problem of maintaining the identifier-to-object(*) relationship: NO technology can guarantee that the mapping between an object’s identifier and its location will always be good. It is a social/organizational problem. The paper on ARK explains it better than I can: http://ark.cdlib.org/arkspec.pdf. Netscape should have set up HTTP redirects to say the resource had permanently moved; the technology was there, but they just didn’t use it. It is equivalent to not updating the index for mapping identifiers to locations in other approaches to naming.

    HTTP response code 410 says the resource is permanently gone. Maybe Netscape didn’t want to serve those 4M requests a day 😉

    Some people think that the http URI scheme is protocol independent: http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html#protocol_independent

    cheers

    Mark

    * Using the term “object” in the broadest sense.

  2. Most importantly, they shouldn’t need the schema.

  3. Martin Probst

    The point is that they want to kill the object. They don’t want you to access it, it’s supposed to be gone, hush, go and use Atom.

    You might argue that it sucks taking down something which is obviously in use, but hey, that’s a different topic.