10 Principles of SOA

In many customer engagements, I need to establish a basic set of principles of SOA. The following sections introduce fundamental principles that a Service-oriented Architecture (SOA) should expose.These are not introduced as an absolute truth, but rather as a frame of reference for SOA-related discussions. You’ll note that the first four are based on Don Box’s four tenets, although over time they may have acquired a slight personal spin. (I’m putting them up now because someone asked me for a version that can be referenced; I haven’t reviewed them again now and may actually have changed my mind with regards to some of them.)

1. Explicit Boundaries

Everything needed by the service to provide its functionality should be passed to it when it is invoked. All access to the service should be via its publicly exposed interface; no hidden assumptions must be necessary to invoke the service. “Services are inextricably tied to messaging in that the only way into and out of a service are through messages”. A service invocation should – as a general pattern – not rely on a shared context; instead service invocations should be modeled as stateless. An interface exposed by a service is governed by a contract that describes its functional and non-functional capabilities and characteristics. The invocation of a service is an action that has a business effect, is possibly expensive in terms of resource consumption, and introduces a category of errors different than those of a local method invocation or remote procedure call. A service invocation is not a remote procedure call.

While consuming and providing services certainly should be as easy as possible, it is therefore undesirable to hide too much of the fact that an interaction with a service takes place. The message sent to or received from the service, the service contract, and the service itself should all be first-class constructs within the SOA. This means, for example, that programming models and tools that are used should at least provide an API that exposes these concepts to the service programmer. In summary, a service exposes its functionality through an explicit interface that encapsulates its internals; interaction with a service is an explicit act, relying on the passing of messages between consumer and provider.

2. Shared Contract and Schema, not Class

Starting from a service description (a contract), both a service consumer and a service provider should have everything they need to consume or provide the service. Following the principle of loose coupling, a service provider can not rely on the consumer’s ability to reuse any code that it provides in its own environment; after all, it might be using a different development or runtime environment. This principle puts severe limits on the type of data that can be exchanged in an SOA. Ideally, the data is exchanged as XML documents validatable against one or more schemas, since these are supported in every programming environment one can imagine.

3. Policy-driven

To interact with a service, two orthogonal requirement sets have to be met:

the functionality, syntax and semantics of the provider must fit the consumer’s requirements,
the technical capabilities and needs must match.

For example, a service provider may offer exactly the service a consumer needs, but offer it over JMS while the consumer can only use HTTP (e.g. because it is implemented on the .NET platform); a provider might require message-level encryption via the XML Encryption standard, while the consumer can only support transport-level security using SSL. Even in those cases where both partners do have the necessary capabilities, they might need to be “activated” – e.g. a provider might encrypt response messages to different consumers using different algorithms, based on their needs.

To support access to a service from the largest possible number of differently equipped and capable consumers, a policy mechanism has been introduced as part of the SOA tool set. While the functional aspects are described in the service interface, the orthogonal, non-functional capabilities and needs are specified using policies.

4. Autonomous

Related to the explicit boundaries principle (5.4.1.1), a service is autonomous in that its only relation to the outside world – at least from the SOA perspective – is through its interface. In particular, it must be possible to change a service’s runtime environment, e.g. from a lightweight prototype implementation to a full-blown, application server-based collection of collaborating components, without any effect on its consumers. Services can be changed and deployed, versioned and managed independently of each other. A service provider can not rely on the ability of its consumers to quickly adapt to a new version of the service; some of them might not even be able, or willing, to adapt to a new version of a service interface at all (especially if they are outside the service provider’s sphere of control).

5. Wire formats, not Programming Language APIs

Services are exposed using a specific wire format that needs to be supported. This principle is strongly related to the explicitness of boundaries principle, but introduces a new perspective: To ensure the utmost accessibility (and therefore, long-term usability), a service must be accessible from any platform that supports the exchange of messages adhering to the service interface as long as the interaction conforms to the policy defined for the service. For example, it is a useful test for conformance to this principle to consider whether it is possible to consume or provide a specific service from a mainstream dynamic programming language such as Perl, Python or Ruby. Even though none of these may currently play any role in the current technology landscape, this consideration can serve as a litmus test to assess whether the following criteria are met:

All message formats are described using an open standard, or a human readable description
It is possible to create messages adhering to those schemas with reasonable effort without requiring a specific programmer’s library
The semantics and syntax for additional information necessary for successful communication, such as headers for purposes such as security or reliability, follow a public specification or standard
At least one of the transport (or transfer) protocols used to interact with the service is a (or is accessible via a) standard network protocol

6. Document-oriented

To interact with services, data is passed as documents. A document is an explicitly modeled, hierarchical container for data. The self-descriptiveness, as outlined in the previous section, is one important aspect of document-orientation. Ideally, a document will be modeled after real-world documents, such as purchase orders, invoices, or account statements. Documents should be designed so that they are useful on the context of a problem domain, which may suggest their use with one or more services.

Similarly to a real-world paper document, a document exchanged with a service will include redundant information. For example, a customer ID might be included along with the customer’s address information (although the customer ID would be enough). This redundancy is explicitly accepted since it serves to isolate the service interface from the underlying data model of both service consumer and service provider. Whena document-oriented pattern is applied, service invocations become meaningful exchanges of business messages instead of context-free RPC calls. While not an absolute required, it can usually be assumed that XML will be used as the document format/syntax.

Messages flowing between participants in an SOA connect disparate systems that evolve independently of each other. The loose coupling principle mandates that the dependence on common knowledge ought to be as small as possible. When messages are sent in a Distributed Objects or RPC infrastructure, client and server can rely on a set of proxy classes (stubs and skeletons) generated from the same interface description document. If this is not the case, communication ceases on the assumption that the contract does not support interaction between those two parties. For this reason, RPC-style infrastructures require synchronized evolution of client and server program code.

This is illustrated by the following comparison. Consider the following message:

2006-03-1347113

and compare it to:

<order>
  <date>2006-03-13</date>
  <product-id>4711</product-id>
  <quantity>3</quantity>
</order>

While it is obvious that the second alternative is human-readable while the first one is not, it is also notable that in the second case, a participant that accesses the information via a technology such as XPath will be much better isolated against smaller, non-breaking changes than one that relies on the fixed syntax. Conversely, using a self-descriptive message format such as XML while still using RPC patterns, such as stub and skeleton generation, serves only to increase XML’s reputation as the most effective way to waste bandwidth. If one uses XML, the benefits should be exploited, too. (See this paper for an excellent discussion of why many current Web services stacks fail this test.)

7. Loosely coupled

Most SOA proponents will agree that loose coupling is an important concept. Unfortunately, there are many different opinions about the characteristics that make a system “loosely coupled”. There are multiple dimensions in which a system can be loosely or tightly coupled, and depending on the requirements and context, it may be loosely coupled in some of them and tightly coupled in others. Dimensions include:

Time: When participants are loosely coupled in time, they don’t have to be up and running at the same time to communicate. This requires some way of buffering/queuing in between them, although the approach taken for this is irrelevant. When one participant sends a message to the other one, it does not rely on an immediate answer message to continue processing (neither logically, nor physically).
Location: If participants query for the address of participants they intend to communicate with, the location can change without having to re-program, reconfigure or even restart the communication partners. This implies some sort of lookup process using a directory or address that stores service endpoint addresses.
Type: In an analogy to the concept of static vs. dynamic and weak vs. strong typing in programming languages, a participant can either rely on all or only on parts of a document structure to perform its work.
Version: Participants can depend on a specific version of a service interface, or be resilient to change (to a certain degree). The more exact the version match has to be, the less loosely coupled the participants (in this dimension). A good principle to follow is Postel’s Law: Service providers should be implemented to accept as many different versions as possible, and thus be liberal in what they accept (and possibly even tolerant of errors), while service consumers should do their best to conform to exact grammars and document types. This increases the overall system’s stability and flexibility.
Cardinality: There may be a 1:1-relationship between service consumers and service providers, especially in cases where a request/response interaction takes place or an explicit message queue is used. In other cases, a service consumer (which in this case is more reasonably called a “message sender” or “event source” may neither know nor care about the number of recipients of a message.
Lookup: A participant that intends to invoke a service can either rely on a (physical or logical) name of a service provider to communicate with, or it can perform a lookup operation first, using a description of a set of capabilities instead. This implies a registry and/or repository that is able to match the consumer’s needs to a providers capabilities (either directly or indirectly).
Interface: Participants may require adherence to a service-specific interface or they may support a generic interface. If a generic interface is used, all participants consuming this generic interface can interact with all participants providing it. While this may seem awkward at first sight, the principle of a single generic (uniform) interface is at the core of the WWW’s architecture.

It is not always feasible nor even desirable to create a system that is loosely coupled in all of the dimensions mentioned above. For different types of services, different trade-offs need to be made.

8. Standards-compliant

A key principle to be followed in an SOA approach is the reliance on standards instead of proprietary APIs. Standards exists for technical aspects such as data formats, metadata, transport and transfer protocols, as well as for business-level artifacts such as document types (e.g. in UBL).

9. Vendor independent

No architectural principle should rely on any particular vendor’s product. To transform an abstract concept into a concrete, running system, it’s unavoidable to decide on specific products, both commercial and free/open source software. None of these decisions must have implications on an architectural level. This implies reliance on both interoperability and portability standards as much as reasonably possible. As a result, a participant can be built using any technology that supports the appropriate standards, not restricted by any vendor roadmap.

10. Metadata-driven

All of the metadata artifacts within the overall SOA need to be stored in a way that enables them to be discovered, retrieved and interpreted at both design and run time. Artifacts include descriptions of service interfaces, participants, endpoint and binding information, organizational units and responsibility, document types/schemas, consumer/provider relationships etc. As much as possible, usage of these artifacts should be automated by either code generation or interpretation and become part of the service and participant life cycle.

Stefan Tilkov's Random Stuff