Protocol Building

Sep 1, 2007

Once again, time to draw attention to a comment, this time from Benjamin Carlyle:

In short, I think that SOA is fine and a proven technology when it is possible to upgrade your whole architecture in one go to the new protocols. I think that REST is the only proven technology when only a tiny fraction of the overall architecture can be upgraded in a single sitting. You can’t upgrade the whole Web. REST accommodates this.

Be sure to read the full comment, which is longer than most of my blog entries.

On September 4, 2007 4:52 PM, Mike Glendinning said:

Oh dear. Where to begin?

One might simply point out that, in our enthusiasm and advocacy for an idea, we must be careful to resist the lure of specious arguments and avoid jettisoning our rationality.

Of course Benjamin makes some interesting points, but let’s examine these in more detail.

First, Benjamin says “existing applications can’t interact with a new app built with the new WSDL”. Why not? Programmatically, I can find the new WSDL for a service through a registry; alternatively it is a common idiom for a web service to return its WSDL when you issue an HTTP GET on the service endpoint. Once I have obtained the WSDL, I can interpret this programmatically and use it to invoke the new service. In the old days of CORBA, we used to call this “dynamic invocation”. Whilst this approach might still not be common, it’s definitely possible and much easier and more practical today than it was 10 years ago.
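
As a rough illustration of the first step of such dynamic invocation, the Python sketch below reads a WSDL 1.1 document and lists the operations it declares. The service, port and operation names are invented for the example, and the WSDL is trimmed to the bare minimum; a real client would also have to fetch the document and interpret its messages and bindings before it could invoke anything.

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, heavily trimmed WSDL; every name here is an assumption.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="InvoiceService">
  <portType name="InvoicePort">
    <operation name="ListInvoices"/>
    <operation name="CreateInvoice"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Return the operation names declared in the WSDL's portType elements."""
    root = ET.fromstring(wsdl_text)
    operations = root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    return [op.get("name") for op in operations]


operation_names = list_operations(SAMPLE_WSDL)
```

Discovering the operation names is the easy part; deciding which of them to call, and with what arguments, still needs knowledge that the WSDL itself does not supply.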

Next, Benjamin describes how in REST, most changes are in the modification or addition of document types. True. But I don’t understand his distinction between “a transition from one version of a document type to the next” and “when one content type is superseded by another”, unless he’s just defining two arbitrary levels of change, a minor tweak versus complete replacement. Also, what has this got to do with REST at all? Giving an example from HTML 3/4 he says “The new content type allows new information to be added, but doesn’t take information away”. Which of the REST constraints or principles enshrines this behaviour? Surely it’s just an attribute of the HTML specification. As far as I can see, REST says very little about the nature of document types, other than they should be composed of hypermedia and belong to “an evolving set of standard data types”.

When talking about the “addition of a completely new content type”, Benjamin says “You don’t do this if it is meaningful for old clients to talk to new servers”. But surely this is one of the essential points about the design of evolvable systems. Is he saying that REST is not appropriate if you want to be able to evolve servers ahead of clients?

The problem here is that REST is fundamentally describing a system of intermediaries, not information processing endpoints. This is what the Web is about. We have discussed the risk of this potential “category error” before in [1] and [2]. I do hope we haven’t regressed.

To perform any useful activity, a REST client needs to understand and act on the information returned. That is, it has to be able to make sense of the document types returned and use this knowledge to make decisions about what to do next. In the REST model, this sense-making and decision making is performed by a human being and by interpreting the information displayed on a screen (or spoken) by a Web browser.

If we want to embed this sense-making and decision making in an automated computer program, then such a REST client will need to have an intimate knowledge of all document types and the detailed real-world semantics attached to them. Just knowing XHTML and Atom isn’t enough. For example, I can definitely process an XHTML document and extract all of the “<a>” tags that contain a “href” attribute. But which of these hyperlinks means “view supplier invoices” and which means “create new invoice”? If I receive a list of Atom entries, how do I know whether this feed refers to the supplier’s invoices or to their contact history?
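
A minimal Python sketch of the point, assuming a made-up supplier page: the parser finds every hyperlink without difficulty, but nothing in the generic markup says which link lists invoices and which creates one. The URLs and link text are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, rel) for every <a> tag that carries an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr_map = dict(attrs)
            if "href" in attr_map:
                self.links.append((attr_map["href"], attr_map.get("rel")))


# Hypothetical page; the paths and labels are invented for illustration.
PAGE = """
<html><body>
  <a href="/suppliers/42/invoices">Invoices</a>
  <a href="/suppliers/42/invoices/new">New</a>
  <a name="top">anchor without href</a>
</body></html>
"""


def extract_links(document):
    """Return the hyperlinks found in an (X)HTML document."""
    collector = LinkCollector()
    collector.feed(document)
    return collector.links


links = extract_links(PAGE)
```

Both links come back with no machine-readable hint of their purpose, so any rule such as “the path ending in /new creates an invoice” has to be agreed out of band and coded into the client.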

In the future, the Semantic Web may provide a way that we can encode such knowledge into our programs, but in the meantime, this “protocol”, that is the “rules governing the syntax, semantics and synchronization of communication” [3] will need to be hard-coded into our REST clients. This is going to be hard and messy and seems very little different to the web services case.

Now, please don’t get me wrong. I’m not anti-REST in any way. But I don’t think it does anyone any favours to overstate the case. The basic REST principles espoused in the Fielding dissertation are a good start, but are really just concerned with basic communication mechanisms, the “plumbing” if you will. Much more work is needed on exactly how and why we need to define document/media types to achieve the properties we want from distributed systems. And we have some serious security and trust issues to deal with as well. Those are the kinds of problems where we need to be devoting our energy and attention.

I would say more on the fallacy of the “evolvability of the web” argument, but I’ve probably taken up too much space already…

Regards,

-Mike Glendinning.

[1] /blog/st/2006/02/22/more_soap_vs_rest_arguments.html

[2] /blog/st/2006/12/04/the_lost_art_of_separating_concerns.html

[3] http://en.wikipedia.org/wiki/Protocol_(computing)

On September 12, 2007 4:35 PM, Benjamin Carlyle said:

Mike,

I come from a SCADA background. The kind of system I work with has a single HMI that fronts up a number of services. Many of these services themselves retrieve telemetry information from devices out in the field. So in a moderately sized system we might have a few dozen HMIs, a dozen or so clustered services, and a few dozen devices beyond them. These are all geographically distributed and have regression-testing implications whenever they change in a customer architecture that is typically running 24 hours a day. My services typically talk to another half dozen or so external systems for various purposes.

In the late 90s I had the O-O bug, and some of the system was developed with a SOA style typical at the time. I developed one interface to allow the HMI to talk to a historical subsystem. I developed others to talk to several other subsystems. Each time I developed a new interface, I had to implement a driver within the HMI and also write code on the server side to implement the interface.

In contrast, there were a few interfaces built around more traditional SCADA concepts. There was a protocol for fetching and updating the state of a named variable. This could talk to any service that supported the protocol. I didn’t have to write new HMI code for this interface when I introduced a new service to the architecture. Instead of adaptors on both sides of the protocol boundary, I just needed both sides to agree. I just needed the server to map the standard requests fairly literally into its objects.
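
A small Python sketch of that idea, under the assumption that the protocol is nothing more than “get” and “put” on a named variable. The class names, the address format and the variables are all hypothetical, not taken from any real SCADA product.

```python
class VariableService:
    """A service that exposes its state only through get/put on named variables."""

    def __init__(self, variables):
        self._variables = dict(variables)

    def get(self, name):
        return self._variables[name]

    def put(self, name, value):
        self._variables[name] = value


class GenericClient:
    """Stands in for the HMI: it knows the protocol, not the individual services."""

    def __init__(self):
        self._services = {}

    def register(self, prefix, service):
        # Deployment-time configuration, not new client code.
        self._services[prefix] = service

    def _resolve(self, address):
        # An address is "<service prefix>/<variable name>" (an assumed format).
        prefix, _, name = address.partition("/")
        return self._services[prefix], name

    def read(self, address):
        service, name = self._resolve(address)
        return service.get(name)

    def write(self, address, value):
        service, name = self._resolve(address)
        service.put(name, value)


hmi = GenericClient()
pump = VariableService({"speed": 1200})
hmi.register("pump", pump)
hmi.write("pump/speed", 900)

# A new service joins the architecture; GenericClient is not modified.
history = VariableService({"retention_days": 30})
hmi.register("history", history)
```

Adding the second service needed only a registration, which is the contrast with writing a new driver for each interface.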

The traditional SCADA model involves universal addressing, minimising the number of protocols in use, and making the protocol interactions that are used as general as possible. The main innovation of REST over SCADA is to separate protocol not into two parts but into three. SCADA protocols have traditionally been inflexible to the introduction of new information schemas. REST systems are flexible to this kind of modification.

To your points:

“Programmatically, I can find the new WSDL for a service through a registry” - You misunderstand. You don’t get to write new software. You have two configuration items. One is already deployed. You don’t get to modify it. You don’t even get to reboot the server. You are adding or upgrading a different configuration item, typically on a completely different server or cluster. If the new or updated configuration item adds a new WSDL, none of the existing deployed configuration items can talk to it using the new protocol.

This case resembles the extremely rare case in REST architecture where a new method is added without a mechanism for backwards-compatibility with existing architecture. It resembles the moderately rare case in REST architecture where a new document type is added, superseding an existing document type. More commonly, however, a document type in a REST architecture is upgraded to a new version without the whole type being superseded by another type.

There are going to be times when additions of this kind are needed to accommodate new system functionality. RSS added new functionality to the web, and Atom superseded it. Developers of O-O systems are used to this being the norm. In REST architectures it occurs at the fringes of the architecture. Millions of web sites are added to the architecture for each new document type or method added.

“what has (forward-compatible document evolution) got to do with REST at all?”. Good point. It isn’t out there in the REST constraints. Perhaps it seemed so obvious at the time that Roy failed to mention it. Perhaps I am reading more aspects of web architecture into REST than is warranted. It is not simply a feature of HTML, however. It is an implication of the principle of least power [1] and the use of must-ignore semantics. You’ll see the same thing in Atom, for example.
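
To make must-ignore semantics concrete, here is a minimal Python sketch. The element names are invented and deliberately simpler than real Atom (no namespaces); the only point is that a client written against the first version of a document type reads the second version unchanged, because it skips anything it does not recognise.

```python
import xml.etree.ElementTree as ET

# Fields the deployed (old) client was written to understand.
KNOWN_FIELDS = ("title", "updated")


def read_entry(xml_text):
    """Extract the known fields and silently ignore every other element."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_FIELDS}


# Hypothetical version 1 of the document type.
V1 = "<entry><title>Pump 3 alarm</title><updated>2007-09-12</updated></entry>"

# Hypothetical version 2 adds an element but takes nothing away.
V2 = (
    "<entry><title>Pump 3 alarm</title><updated>2007-09-12</updated>"
    "<severity>high</severity></entry>"
)

old_client_sees_v1 = read_entry(V1)
old_client_sees_v2 = read_entry(V2)
```

The old client gets the same result from both versions, so the server can be upgraded first and the clients later, or never.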

“REST is fundamentally describing a system of intermediaries, not information processing endpoints”. I disagree. REST is describing a system in which intermediaries are possible. The principles in REST that make intermediaries possible also benefit the end-points. Less code needs to be written or generated on both sides. Components are more likely to work together, and the architecture evolves better. O-O is good at capturing a set of parts at a particular time. REST as embodied in the web is much better than O-O at capturing different versions of the same parts and a wider range of different parts over long stretches of time and space.

“In the REST model, this sense-making and decision making is performed by a human being” - In the HTTP model a human understands the document as rendered. However it is the browser that understands and correctly renders the data it receives. REST doesn’t fundamentally alter the SOA landscape. It modifies the protocols you would use between client and server, but does not ultimately change which clients and servers you would deploy into your architecture. The same requirements would trace to the same software module.

The main difference between REST and O-O is that you don’t keep a registry of interfaces/protocols in your architecture model. Instead, you keep separate registries of client/server interactions and content types that can be transferred in those interactions.

“In the future, the Semantic Web may provide a way that we can encode such knowledge into our programs”. The thing to understand with the Web at large is that its content types don’t carry a lot of semantics… and that is precisely why they are successful in what they do. It is important that we -don’t- have more document types than we need. If we have a document with weak semantics that does not only my job but yours as well, we don’t both need to write web browsers. We can share the one that understands our common document type.

HTML is good enough for a -darn- lot of things. This is central to its success. New content types on the scale of the Web need a lot of buy-in and common need. We are only starting now to see formats like Atom and iCalendar really start to become important on this wide field. This should not be misunderstood as meaning that the REST style is incapable of dealing with complex semantics. Observing just these document types indicates that it is more than capable. In smaller contexts the need for specific rich semantics breeds more and more specific content types. However, the REST way is that there will be far fewer document types than there are services in any particular architecture.

For me as a SCADA architect, REST and its predecessors reduce the amount of code that needs to be written or generated down towards a natural level of complexity. More importantly, they give me powerful tools in maintaining both a product base and individual architectures consisting of many different separately-upgradeable moving parts.

Benjamin.

[1] http://www.w3.org/DesignIssues/Principles.html