Decentralizing Media Types

A while back, Sanjiva Weerawarana proposed (via email) a way to decentralize media types. I think the proposal is excellent, and Dan Diephouse’s latest blog post reminded me of it. Here’s a brief introduction to this possible solution for “decentralizing media types”.

The Problem

In a plain HTTP interaction, the Content-Type and Accept headers carry information about the type of the data being transmitted and accepted, respectively. You’ve seen these media types in numerous examples, e.g. a typical request or response might have a Content-Type header with the value application/xml.

The problem with this approach is that media types have to be registered centrally with IANA. This means that while you can invent your own media types, nobody will know about them — unless you go through the time-consuming process of actually having your media type registered.

What’s wrong with application/xml? Nothing, really, except that it tells you only that the payload is XML. You have no way to tell which kind of XML it is unless you actually parse it and, e.g., look at the XML namespace of the outermost element.
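To make the cost of that concrete, here is a minimal Python sketch of what a receiver has to do today: parse the whole body just to find out which vocabulary it contains. The order document and its namespace URI are made up for illustration.

```python
import xml.etree.ElementTree as ET


def root_namespace(xml_text):
    """Return the namespace URI of the outermost element, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced tags as "{namespace-uri}localname".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


# Hypothetical payload; the namespace is an assumption, not a real format.
doc = '<order xmlns="http://example.com/ns/orders"><item/></order>'
print(root_namespace(doc))
```

Nothing in the application/xml header helps here; the information only becomes available after parsing.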

The Solution

What Sanjiva (and his collaborators, Paul Fremantle, Jonathan Marsh and James Clark) propose is this: Define a single new media type, application/data-format, with a required parameter uri. This uri points to a definition of the data format, like this:

application/data-format;uri=http://mediatypes.example.com/foo/bar

The uri is an HTTP URI that points to an RDDL document. In other words, you can do an HTTP GET on it and retrieve documentation of the data format that is both human-readable and machine-processable.
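As a rough illustration, a receiver could pull the uri parameter out of the header value like this. The function name is my own, and the parsing is deliberately simplified: it does not cope with a semicolon inside the URI. It accepts a quoted value as well, since HTTP's parameter grammar would, as far as I can tell, require quoting a value that contains characters such as ":" and "/".

```python
DATA_FORMAT = "application/data-format"


def data_format_uri(header_value):
    """Return the uri parameter of an application/data-format value, else None."""
    media_type, _, params = header_value.partition(";")
    if media_type.strip().lower() != DATA_FORMAT:
        return None
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "uri":
            # Accept both the bare form shown above and a quoted form.
            return value.strip().strip('"')
    return None


header = "application/data-format;uri=http://mediatypes.example.com/foo/bar"
print(data_format_uri(header))
```

Any other media type, or a value without the uri parameter, yields None in this sketch; whether a missing parameter should be treated as an error is left open here.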

My Opinion

I think this is an excellent proposal, specifically because it does not rely on a centralized authority, and re-uses the namespacing concepts of the Web. It’s also fully agnostic towards any specific data format — you can use your own binary or text format, something like JSON or YAML, and if you pick XML, you’re free to use DTDs, RELAX NG schemas, Schematron or even XML Schema to document it. It’s also great in that it allows for clients with different knowledge about any particular format to do their best to handle it. One client might be hard-coded against the complete string; another might retrieve the RDDL, look for an XSD, and dynamically render some fancy visual representation.
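The simplest of those clients, the one hard-coded against the complete string, might look roughly like the sketch below. The handler and the fallback messages are invented for illustration; a smarter client would fetch the RDDL document at the fallback point instead of giving up.

```python
# The complete header value this client was written against.
FOO_BAR = "application/data-format;uri=http://mediatypes.example.com/foo/bar"


def handle_foo_bar(body):
    # Hypothetical handler for the one format this client understands.
    return "handled foo/bar payload of %d bytes" % len(body)


# Plain string comparison: no parsing, no network access.
HANDLERS = {FOO_BAR: handle_foo_bar}


def dispatch(content_type, body):
    handler = HANDLERS.get(content_type.strip())
    if handler is not None:
        return handler(body)
    # A more capable client could retrieve the format description here.
    return "unsupported format: " + content_type


print(dispatch(FOO_BAR, b"abc"))
```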

I think the concept could even be extended to allow for querying of supported media types: You could just do a GET on the resource with an Accept header of application/data-format and get back the link to the RDDL (if there is any).
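On the client side, such a query would be an ordinary GET with a different Accept header. The sketch below only builds the request and does not send it; the resource URL is a placeholder, and how a server would answer is exactly the part that would still need to be specified.

```python
import urllib.request


def build_format_query(resource_url):
    """Build (but do not send) a GET asking which data format a resource uses."""
    return urllib.request.Request(
        resource_url,
        headers={"Accept": "application/data-format"},
        method="GET",
    )


# Placeholder resource; nothing is sent over the network here.
req = build_format_query("http://example.com/orders/42")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```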

Maybe there’s something immediately, obviously wrong with this idea — but if so, I can’t see it. It will be interesting to see what others say …

Stefan Tilkov's Random Stuff
