New Languages


Being a programming language geek, I typically try to use the Christmas vacation to learn (or rather, play with) a programming language I don’t know. This year, I find this very hard, as there are so many interesting languages one could spend time with. Some I have considered are:

  • Go: I was thoroughly unimpressed with this language when it came out, and I still fail to see a lot of interesting stuff in it. But I’ve heard many people I respect say only good things about their experience with it, so maybe I should reconsider.
  • Rust: At first glance, this seems to be a very nicely designed language (and it has a really excellent tutorial). Even though its language features are very advanced, it seems to be intended for low-level use cases (that I mostly don’t have).
  • Fantom: Seems to be interesting, too; I remember I looked at it a long time ago, but never in depth.

What do you think? What else is worth a look?

Some Thoughts on SPDY


What follows is a number of claims about SPDY, none of which I can back up in any reasonable way. My hope is that readers who know more will educate me. SPDY, in case you don’t know it, is a replacement for the HTTP wire protocol. It originated with Google and aims to preserve HTTP’s application semantics while improving its efficiency and performance.

  • Supporting true multiplexing on top of a single TCP connection is great. There is no way anybody can prefer the HTTP 1.0 model, which forces a separate TCP connection per request, or the HTTP 1.1 model, which allows for persistent connections but still requires serialized request/response interactions (never mind HTTP pipelining, as it doesn’t work in practice). Browsers having to open separate connections to the same server to achieve parallelism is not a satisfactory solution.
  • Reducing header overhead is also an excellent idea. I’ve heard some criticism about the way this is actually done in SPDY, but it very clearly serves no purpose to have a browser send e.g. the same ridiculously long user agent string with each and every request.
  • I used to not care much for the push support, i.e. the opportunity for the server to actively send stuff to the client, for the same reason I’m not a fan of Websockets: I don’t think you actually need this in practice on the application level. But in a session given by Jetty developer Thomas Becker today, I learned about a quite intriguing use of this in Jetty’s SPDY implementation: On the first request of a page and the subsequent requests for the resources that are referenced in that page, Jetty will build a map based on the Referer header – it essentially remembers which secondary resources a page references. When the next request for that page comes along, the server can actively send the referenced resources before the client actually asks for them (see the sketch after this list).
  • I think the fact that SPDY requires TLS is a mistake. While I totally concede that most traffic on the Net is (and should be) encrypted, there are many use cases e.g. within an organization or for public information where this does not make sense. Besides, it prevents the usage of intermediaries, even though I admit these will be much harder to build for SPDY than for plain HTTP anyway.
  • While SPDY proponents point to impressive performance improvements, those improvements are most impressive for sites whose front end is poorly implemented. For sites that are already optimized in terms of front-end performance – content minimized and compressed, number of requests reduced, proper caching in place – the effect is going to be much smaller. That said, some of the things we do in terms of optimization, e.g. combining multiple CSS or JS files into a single one, are not exactly milestones of architectural purity.
  • For machine-to-machine communication – i.e. typical RESTful web services – I don’t think SPDY will have the same kind of effect as for Web sites, but I’m willing to let myself be convinced otherwise.
  • One of the sad but inevitable things when introducing a binary protocol as opposed to a text-based one is reduced transparency for humans. If SPDY becomes successful – and I have little doubt it will – being able to telnet to a server’s port 80 is going to be what I miss most.
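
To make the push idea from the Jetty session a bit more concrete, here is a rough sketch of the kind of Referer-based map such a server could keep. This is my own illustration in Ruby, with made-up names – it is not Jetty’s actual implementation:

```ruby
require 'set'

# Hypothetical illustration of a Referer-based push map (not Jetty's actual code).
class PushMap
  def initialize
    # page URI => set of secondary resources that this page referenced
    @secondary = Hash.new { |hash, page| hash[page] = Set.new }
  end

  # Called for every incoming request: if it carries a Referer header,
  # remember that the referring page needs this resource.
  def record(path, referer)
    @secondary[referer] << path if referer
  end

  # When the page itself is requested again, these are the resources
  # the server could push before the client asks for them.
  def resources_to_push(page)
    @secondary[page].to_a
  end
end

map = PushMap.new
map.record('/style.css', '/index.html')
map.record('/app.js', '/index.html')
map.resources_to_push('/index.html') # => ["/style.css", "/app.js"]
```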

SPDY has a very good chance of essentially becoming the new HTTP 2.0, and I’m happy about it: I’m pretty confident the HTTP WG with the formidable Mark Nottingham taking care of things will produce something that will be usable for a long time to come.

innoQ Company Events


Since the beginning of innoQ’s existence, 13 years ago, we’ve maintained a regular schedule of so-called “company events”. In my opinion, this is one of the really, really great things about innoQ, and it’s also quite different from what others do. Which is a sufficient excuse for me to write this …

So what’s an innoQ event? All of innoQ’s consultants meet at some more or less remote venue and spend two or three days there, discussing projects, technology, methods, in general: anything of interest to our work. Most of the time we use the classical conference format (an innoQ guy presents something, followed by sometimes controversial discussion), but we use other approaches, such as open spaces, pecha kuchas, and lightning talks, too. We occasionally invite guests (we were lucky to have e.g. Steve Vinoski, Michael Hunger, Markus Voelter, Michael Nygard pay us a visit). While the location is mostly somewhere in Germany, we go to Switzerland sometimes, and one event per year is reserved for a trip “far far away” (in the past years we went to e.g. Prague, Barcelona, Paris, Rome, Budapest, and Strasbourg; these are the only events where we actually spend a day just sightseeing). Some of the events focus on project reports, others are programming events, one event per year is dedicated to company strategy.

What is amazing to most people I talk to about this is the frequency we do this with, and the resulting amount of time, effort and money we invest. We do 6-8 events per year, 2 of them three days long, the rest two days. Events are always scheduled during regular workdays, typically Thursdays and Fridays; attendance is mandatory. This adds up to 15-18 days per person per year, with the most significant cost factor being the “lost” revenue. Of course there’s also a lot of effort involved in organizing the whole thing: The colleague who does this essentially organizes 6-8 small conferences (we’re regularly about 50 people these days) per year (no small feat; thanks, Thomas).

It’s worth every single cent.

Company events are among the very best things about innoQ. They help us spend some quality time discussing various topics, whether it’s company strategy, a new programming language, library, framework or product, a new approach to project management, or some very project-specific problem. We’re also able to invite candidate employees to a setting where they have a great chance to get to know how innoQ works.

Most important of all, they’re fun. We spend a lot of time doing geek things, but there’s always time for great food, occasional drinks, and socializing and talking about other important things in life.

So if you see me tweet from Barcelona during the next three days (where I plan to spend some time with the works of one of my favorite artists tomorrow), you know why I’m there.

Hypermedia Benefits for M2M Communication


Hypermedia is the topic that generates the most intensive discussions in any REST training, workshop, or project. While the other aspects of RESTful HTTP usage are pretty straightforward to grasp and hard to argue with, the use of hypermedia in machine-to-machine communication is a bit tougher to explain.

If you are already convinced that REST is the right architectural style for you, you can probably stop reading and switch over to Mike Amundsen’s excellent “H Factor” discussions. That may be a bit tough to start with, though, so I thought it might make sense to explain some of the ways I use to “pitch” hypermedia. I’ve arrived at a number of explanations and examples of meaningful hypermedia usage, explicitly targeted at people who are not deep into REST yet:

  • “Service Document” and “Registries”: Especially for people with an SOA/SOAP/WSDL/WS-* background, the idea of putting at least one level of indirection between a client (consumer) and server (provider) is well established. A link in a server response is a way for a client to not couple itself to a particular server address, and a server response including multiple links to “entry point” resources is somewhat similar to a small registry. If providing links to actual entry point resources is the only purpose of a resource, I’ve become used to calling it a “service document” (a minimal sketch follows this list). Of course these documents themselves can be linked to each other, allowing for hierarchies or other relationships; they can support queries that return a subset of services; they can be provided by different servers themselves; and they are way more dynamic than a typical registry setup. In other words, the very simple approach included in HTTP is far more powerful than what most crazily expensive registries provide.
  • Extensible contracts: The merits of being able to link to resources can be exploited to add functionality to existing systems without breaking their contracts. This is most visible in server responses that have one or more places where you can put additional links. As your server or service evolves, you can add links that will be ignored by existing clients, but can be used by those that rely on them. The concept of “a link to a resource” is both generic enough to be widely applicable and specific enough to be meaningful, especially if you include a way to specify what a link actually means via a link rel=… approach (but more on that in a separate post).
  • Co-Location Independence: What I mean by this slightly weird term is the fact that while resources that are exposed as part of a service interface are sometimes (or maybe even often) designed in a way that requires them to be part of the same implementation, they very often are not, i.e. they could at least in theory be part of a different system. (In fact you can reasonably argue that there should be no assumption about this at all, neither on the server nor the client side, for something to be rightfully called “RESTful”, but I simply haven’t found that to be doable in practice.) In those cases where resources don’t need to be hosted by the same implementation, you can and should couple them via links and have clients navigate them instead of relying on common URI paths.
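
Here is the minimal sketch promised above: a hypothetical service document and a client that resolves an entry point by link relation instead of hard-coding a URI. The JSON structure, link relations, and URIs are all made up for illustration; no particular media type is implied:

```ruby
require 'json'

# A hypothetical service document a server might return from its root resource.
service_document = <<-JSON
{
  "links": [
    { "rel": "orders",    "href": "https://example.com/orders" },
    { "rel": "customers", "href": "https://example.com/customers" }
  ]
}
JSON

# The client only knows the link relation it cares about; the actual URI can
# change (or move to a different server) without breaking the client.
links       = JSON.parse(service_document)['links']
orders_link = links.find { |link| link['rel'] == 'orders' }
orders_uri  = orders_link && orders_link['href']
# => "https://example.com/orders"
```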

There are quite a few more examples to talk about, but I won’t do that now as I promised to publish something today and don’t want to get into the habit of keeping drafts lying around for too long again. (I know, lengthy this is probably not. Sue me.) So please let me know in the comments what you think of these three if you’re just starting to pick up REST, and what additional explanations you use if you’ve already done so.

Waterfall Sucks, Always. Duh.


Today, I had an interesting discussion over Twitter related to project organization in restricted environments. (Update: I’ve removed all references to the actual discussion because the person I interacted with felt mis-quoted, and I don’t think that it was actually that important with regards to what I actually wanted to get across.) This prompted me to take a break from my usual topics and elaborate a bit on my thoughts with more than 140 characters. All this assumes you’re in the role of not only having to actually deliver a piece of software, but also to get the necessary funding – regardless of whether you’re part of an internal organization that requires approval from top management or you’re an external partner that needs to sell to its customer. That said, I’ll focus on the second scenario as that is my primary focus at innoQ.

First of all, in an ideal world, the customer understands all of the reasons that led to the Agile movement, e.g. accepts that an upfront specification that’s 100% correct is an unattainable pipe dream, agrees to participate in the actual development process, and most importantly, understands that what needs to be built will only become clear during the actual development project. We do have some customers who understand this very clearly, and they agree to our favorite pricing model: We offer a sprint/iteration for a fixed price or on a T&M basis, and after each sprint the customer can decide to continue or to stop (which will require one final short wrap-up phase). This reduces the customer’s risk, which is often seen as a benefit big enough to outweigh the perceived disadvantage of not knowing what the overall cost will be. It’s great to be able to work in an environment where everybody’s goals are perfectly aligned, and this is the case in this model.

Unfortunately, this ideal model is not always an option. Of course one way for a development organization to ensure that all projects are done this way is to simply refuse doing it in any other fashion. That’s a good option, but whether it’s actually doable strongly depends on internal power distribution or external market forces.

But what do you do when you have to accept a different set of restrictions? For example, the customer/stakeholder might require a fixed-scope, fixed-time, fixed-price offer. My guess is we can all agree that this is a bad idea for everyone involved. But how do you approach things if you just have to do things this way? What do you do if, as an additional downside, the developers assigned to the project are not as skilled as you’d like them to be?

A possible answer might be to use a classical waterfall approach, but I think this is never a good choice. At the very least, go with an iterative approach, even if that means you have to game the system to do that.

Of course you have to put some effort into an initial up-front analysis. You’ll be aware that much of what you find out may actually turn out to be wrong, but it’s still better to make a slightly more informed estimate up front as opposed to a totally bogus one, especially if you’re an external partner that’s supposed to provide a fixed-price quote. Then, make sure that you grow the system in increments – i.e., build a first working system covering a number of interesting use cases; then add functionality in the next iteration, and continue until done.

Typically, this will resemble something like an agile process – but with slightly larger iterations (e.g. maybe 6 weeks instead of two), and with the added amount of documentation required to fulfill the typical waterfall requirements. (If this reminds you of a Unified Process kind of thing, that’s no coincidence.)

In the end, you’ll have created all of the documents and other artefacts required – just not in the order they were supposed to be generated (first analysis, then design, then implementation, then test), but rather with the trimmed-down focus of each iteration.

Is this perfect? Not even remotely. But in my experience, you have a far greater chance to meet your goals than with actually following the waterfall approach, and even more importantly, management is likely to accept it (partially because it’s obvious, partially because you don’t tell them about it).

If you can’t get away with that, you’re really out of luck, and it’s as they say: You need to change the company, and if you can’t, change companies.

Announcing “ROCA”


In the past few days, we finally managed to write down a few ideas on Web frontend design, basically a set of rules to apply if the goal is to come up with a Web app that is actually on the Web, as opposed to being tunnelled through the Web. We tried to come up with a catchy name, and finally arrived at “ROCA”, a conveniently pronounceable acronym for “Resource-oriented client architecture”.

I am aware that for many folks, specifically those who are interested in REST and thus likely to read this, a common reaction might be “Duh”. And one of the main things I’d like to stress is that we have not invented a single thing, but rather collected a certain set of rules that we found many people liked, but couldn’t name.

Since we started discussing this, we’ve found strong support, as well as violent opposition. Which is exactly what we were looking for, because in only very few cases did people not understand what we described – and that’s the whole point of the effort: Give a name to a certain cohesive set of practices so that they can be used as a reference, whether you agree with them and want to build a system that adheres to them, or criticize them because you disagree.

I’m looking forward to comments over at the discussion page. If you believe we should make changes, please fork the GitHub repo and create a pull request.

REST vs. Websockets, Oh My


There is an entirely absurd discussion going on about “REST vs. Websockets”, predictably claiming that in the long term, REST will be succeeded by Websockets. This is stupid on many levels. I’ll try to be brief:

  • To be pedantic, REST vs. … almost never makes sense, as people are rarely talking about REST (the architectural style) in comparison to another architectural style. But never mind, let’s assume that what was meant was actually “RESTful HTTP” vs. “Websockets”, then …
  • Websockets is not something “more”, it doesn’t add something, it’s not dynamic, or interactive, or in any way “good” – unless you make the same claim about TCP. Websockets essentially allows you to build your own proprietary protocols that may or may not be great, with all the typical advantages and disadvantages these end up having: possibly better performance, possibly better suited to the specific task at hand, less standardized, not widely implemented, etc. It’s not a case of one being better than the other, it’s about being different.
  • In the long run, HTTP (used in a way aligned with its architectural goals) will continue to have benefits for loosely coupling systems. If that’s what you want, it makes the most sense. If you’re after the most efficient communication possible, and are willing to sacrifice some of the loose coupling – fine, go ahead, use Websockets. But it’s not as if one will supersede the other.
  • Does this mean I claim that HTTP is perfect? Of course not, it most definitely could be improved. But if this improvement comes, it’s definitely going to introduce more, not fewer, constraints.

PATCHing Rails


As mentioned on the Ruby on Rails weblog, Rails 4.0 will include (optional) support for partial updates via PATCH, a change included to better comply with the HTTP spec and the REST architectural style. As you can guess, I really like this motivation, even though I think it’s insufficient to justify major changes – being “RESTful” should not be the goal, building a better system should be. It seems the Rails team has found a good way to do this, as the change is made in a backwards-compatible fashion (so if you don’t care, you can simply ignore it). But it highlights one of the things I really, really like about Rails: It tries to make it a lot easier to build something that’s RESTful than something that isn’t, and its reach means many more people will be exposed to this as the way it’s being done.
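
For anyone who hasn’t looked at the details: the routing side of this is pleasantly boring. In Rails 4, a standard resource declaration routes both PATCH and PUT for a member to the same update action, which is what makes the change backwards compatible. A minimal sketch (the application and resource names are made up):

```ruby
# config/routes.rb (sketch)
MyApp::Application.routes.draw do
  resources :customers
  # In Rails 4 this maps both  PATCH /customers/:id  and  PUT /customers/:id
  # to CustomersController#update, so existing clients keep working.
end
```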

So what about the change itself? When should you use PUT, POST, PATCH? First of all, these are the truths I base my views on:

  • POST can mean anything; its most common use is to create something under a location determined by the server; it’s neither safe nor idempotent nor cacheable; it should be used whenever using any of the other methods would violate one of their guarantees.
  • PUT can mean creation or update; it affects the resource to which it is applied; it’s idempotent; it contains a full representation of the resource (as far as the client is responsible for it). Most importantly, by using PUT the client asks the server to store (in the widest possible sense) the representation under the location provided.
  • PATCH, a relatively new verb (at least in its standardized form), is intended to address partial updates, i.e. it updates only parts of the resource it’s being applied to; it’s not idempotent; the client asks the server to change parts of a resource.

If you’re a server developer and want to enable your clients to update only parts of a resource – say, a customer’s address – you basically have three options:

  • POST the new address to the resource and have the server decide what to do – in this case, process the address change only – based on the content
  • Expose each part you want to be changeable individually as a resource in its own right and use PUT, i.e. make address a resource http://…/customer/:id/address that you can PUT to
  • Use PATCH to send information about the intended change to the resource itself, using an appropriate format understood by the server

Using POST is OK, but only because it essentially means nothing. The PUT option is perfectly fine, but requires you to explicitly create resources for this purpose. This is actually the best option in many cases, especially if the resources you create in the process turn out to be meaningful in their own right and support other methods (GET in particular). It often ends up feeling a bit contrived, though, so it’s nice to have the third option: Using PATCH means you are being very clear about the purpose of the request, and don’t need to create new and possibly otherwise unnecessary resources. It’s still fully RESTful because PATCH is an extremely generic method.
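
To make the third option concrete, here is what such a request could look like from the client side, using nothing but Ruby’s standard library. The URI, field names, and JSON body format are made up for illustration; the server would have to document which format it accepts:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical example: change only a customer's address via PATCH.
uri     = URI('http://example.com/customers/42')
request = Net::HTTP::Patch.new(uri.request_uri, 'Content-Type' => 'application/json')
request.body = JSON.generate(address: { street: '1 Example Street', city: 'Springfield' })

response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
puts response.code # e.g. "200" or "204" if the server accepted the partial update
```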

Note that while using POST for partial updates is OK, using PUT (as Rails does) is not, because it violates the behavior as defined by the spec. So changing it is a very good idea, and the only two options are POST and PATCH.

Even though PATCH is clearly useful, and has the ultimate REST authority’s blessing, my recommendation in the past has been to avoid it because you can’t count on anyone (or anything) supporting it, and go with PUT or POST instead. With Rails’ influence, I see this changing – and I very much look forward to being able to include PATCH in my RESTful designs in the future. Add PATCH (in addition to PUT and DELETE) to HTML 5, and I’ll be more than happy …

Blog Update


I took the opportunity of a 10-hour flight to finally re-import all of my blog’s old entries into Octopress, fiddling around with a semi-automated process of Ruby and shell scripts and some creative Emacs macros (check out the archives if you care, even though I have no idea why you would). So if this worked, all of the old stuff should be preserved, even though most of it now only has historic value (and prevents Google searches from hitting a 404). I’ve also taken the risk of moving it all back to its old location, and will redirect the temporary one back once I’m sure everything works. One less excuse to not take up blogging again …

XPath and XML


Aristotle Pagaltzis has written a very nice and concise XPath intro. Of all the various standards in the XML ecosystem, I like XPath best, most of all because it enables interaction with an XML document or message in a way that matches Postel’s law: It’s a great way to implement code that doesn’t break each time a minor change is made to an XML format. In fact I’d say it’s one of the few very good reasons to stick with XML instead of adopting JSON – even though there are things like JSONPath, they don’t have the tool support and standardization you get from XPath.
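
As a small illustration of that Postel’s-law friendliness, here is a sketch using REXML from Ruby’s standard library (the document structure is made up): the XPath expression only states what we are looking for, so reordering elements or adding new ones around it doesn’t break the lookup:

```ruby
require 'rexml/document'

xml = <<-XML
<order id="42">
  <customer>ACME Corp.</customer>
  <!-- new or reordered elements here don't break the lookup below -->
  <total currency="EUR">129.95</total>
</order>
XML

doc   = REXML::Document.new(xml)
total = REXML::XPath.first(doc, '//order/total')

puts total.text                   # => "129.95"
puts total.attributes['currency'] # => "EUR"
```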