Stefan Tilkov's Random Stuff

Waterfall Sucks, Always. Duh.

Today, I had an interesting discussion on Twitter about project organization in restricted environments. (Update: I’ve removed all references to the actual discussion because the person I interacted with felt misquoted, and I don’t think it was that important with regard to what I wanted to get across.) This prompted me to take a break from my usual topics and elaborate a bit on my thoughts with more than 140 characters. All this assumes you’re in the role of not only having to deliver a piece of software, but also having to get the necessary funding – regardless of whether you’re part of an internal organization that requires approval from top management or you’re an external partner that needs to sell to its customer. That said, I’ll focus on the second scenario, as that is my primary focus at innoQ.

First of all, in an ideal world, the customer understands all of the reasons that led to the Agile movement, e.g. accepts that an upfront specification that’s 100% correct is an unattainable pipe dream, agrees to participate in the actual development process, and most importantly, understands that what needs to be built will only become clear during the actual development project. We do have some customers who understand this very clearly, and they agree to our favorite pricing model: We offer a sprint/iteration for a fixed price or on a T&M basis, and after each sprint the customer can decide to continue or to stop (which will require one final short wrap-up phase). This reduces the customer’s risk, which is often seen as a benefit big enough to outweigh the perceived disadvantage of not knowing what the overall cost will be. It’s great to be able to work in an environment where everybody’s goals are perfectly aligned, and this is the case in this model.

Unfortunately, this ideal model is not always an option. Of course, one way for a development organization to ensure that all projects are done this way is to simply refuse to do them in any other fashion. That’s a good option, but whether it’s actually doable strongly depends on internal power distribution or external market forces.

But what do you do when you have to accept a different set of restrictions? For example, the customer/stakeholder might require a fixed-scope, fixed-time, fixed-price offer. My guess is we can all agree that this is a bad idea for everyone involved. But how do you approach things if you just have to do things this way? What do you do if, as an additional downside, the developers assigned to the project are not as skilled as you’d like them to be?

A possible answer might be to use a classical waterfall approach, but I think this is never a good choice. At the very least, go with an iterative approach, even if that means you have to game the system to do that.

Of course you have to put some effort into an initial up-front analysis. You’ll be aware that much of what you find out may actually turn out to be wrong, but it’s still better to make a slightly more informed estimate up front as opposed to a totally bogus one, especially if you’re an external partner that’s supposed to provide a fixed-price quote. Then, make sure that you grow the system in increments – i.e., build a first working system, using a number of interesting use cases; then add functionality in the next iteration, and continue until done.

Typically, this will resemble something like an agile process – but with slightly larger iterations (e.g. maybe 6 weeks instead of two), and with the added amount of documentation required to fulfill the typical waterfall requirements. (If this reminds you of a Unified Process kind of thing, that’s no coincidence.)

In the end, you’ll have created all of the documents and other artefacts required, just not in the order they were supposed to be generated (first analysis, then design, then implementation, then test), but iteration by iteration, each with a trimmed-down focus.

Is this perfect? Not even remotely. But in my experience, you have a far greater chance of meeting your goals than you would by actually following the waterfall approach, and even more importantly, management is likely to accept it (partially because it’s obvious, partially because you don’t tell them about it).

If you can’t get away with that, you’re really out of luck, and it’s as they say: You need to change the company, and if you can’t, change companies.

Announcing “ROCA”

In the past few days, we finally managed to write down a few ideas on Web frontend design, basically a set of rules to apply if the goal is to come up with a Web app that is actually on the Web as opposed to being tunnelled through the Web. We tried to come up with a catchy name, and finally arrived at “ROCA”, a conveniently pronounceable acronym for “Resource-oriented client architecture”.

I am aware that for many folks, specifically those who are interested in REST and thus likely to read this, a common reaction might be “Duh”. And one of the main things I’d like to stress is that we have not invented a single thing, but rather collected a certain set of rules that we found many people liked, but couldn’t name.

Since we started discussing this, we’ve found strong support, as well as violent opposition. Which is exactly what we were looking for, because only in very few cases did people not understand what we described, and that’s the whole point of the effort: Give a name to a certain cohesive set of practices so that they may be used as a reference, whether you agree with them and want to build a system that adheres to them, or want to criticize them because you disagree.

I’m looking forward to comments over at the discussion page. If you believe we should make changes, please fork the GitHub repo and create a pull request.

REST vs. Websockets, Oh My

There is an entirely absurd discussion going on about “REST vs. Websockets”, predictably claiming that in the long term, REST will be succeeded by Websockets. This is stupid on many levels. I’ll try to be brief:

  • To be pedantic, REST vs. … almost never makes sense, as people are rarely talking about REST (the architectural style) in comparison to another architectural style. But never mind, let’s assume that what was meant was actually “RESTful HTTP” vs. “Websockets”, then …
  • Websockets is not something “more”, it doesn’t add something, it’s not dynamic, or interactive, or in any way “good” – unless you make the same claim about TCP. Websockets essentially allows you to build your own proprietary protocols that may or may not be great, with all the typical advantages and disadvantages these end up having: possibly better performance, possibly better suited to the specific task at hand, less standardized, not widely implemented, etc. It’s not a case of one being better than the other, it’s about being different.
  • In the long run, HTTP (used in a way aligned with its architectural goals) will continue to have benefits for loosely coupling systems. If that’s what you want, it makes the most sense. If you’re after the most efficient communication possible, and are willing to sacrifice some of the loose coupling – fine, go ahead, use Websockets. But it’s not as if one will supersede the other.
  • Does this mean I claim that HTTP is perfect? Of course not, it most definitely could be improved. But if this improvement comes, it’s definitely going to introduce more, not fewer, constraints.

PATCHing Rails

As mentioned on the Ruby on Rails weblog, Rails 4.0 will include (optional) support for partial updates via PATCH, a change included to better comply with the HTTP spec and the REST architectural style. As you can guess, I really like this motivation, even though I think it’s insufficient to justify major changes – being “RESTful” should not be the goal, building a better system should be. It seems the Rails team has found a good way to do this, as the change is made in a backwards-compatible fashion (so if you don’t care, you can simply ignore it). But it highlights one of the things I really, really like about Rails: It tries to make it a lot easier to build something that’s RESTful than something that isn’t, and its reach means many more people will be exposed to this as the way it’s being done.

So what about the change itself? When should you use PUT, POST, PATCH? First of all, these are the truths I base my views on:

  • POST can mean anything; its most common use is to create something under a location determined by the server; it’s neither safe nor idempotent nor cacheable; it should be used whenever using any of the other methods violates one of their guarantees.
  • PUT can mean creation or update; it affects the resource to which it is applied; it’s idempotent; it contains a full representation of the resource (as far as the client is responsible for it). Most importantly, by using PUT the client asks the server to store (in the widest possible sense) the representation under the location provided.
  • PATCH, a relatively new verb (at least in its standardized form), is intended to address partial updates, i.e. it updates only parts of the resource it’s being applied to; it’s not idempotent; the client asks the server to change parts of a resource.
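
The difference in idempotency between PUT and PATCH can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory `store`, the `put` and `patch` helpers, and in particular the list-of-operations patch format, which is made up and not a standardized one. A PATCH can of course be designed so that repeating it is harmless; the point is only that, unlike PUT, nothing guarantees it.

```python
import copy

def put(store, path, representation):
    """PUT: store the full representation under the location provided."""
    store[path] = copy.deepcopy(representation)

def patch(store, path, operations):
    """PATCH: apply change instructions (hypothetical format) to an existing resource."""
    resource = store[path]
    for op in operations:
        if op["op"] == "set":
            resource[op["field"]] = op["value"]
        elif op["op"] == "append":
            resource[op["field"]].append(op["value"])
        else:
            raise ValueError(f"unsupported operation: {op['op']}")

store = {}
customer = {"name": "Ada", "phones": ["555-0100"]}

# Sending the same PUT twice leaves the server in the same state as sending it once.
put(store, "/customers/1", customer)
after_one_put = copy.deepcopy(store)
put(store, "/customers/1", customer)
assert store == after_one_put

# Sending this particular PATCH twice changes the state a second time.
change = [{"op": "append", "field": "phones", "value": "555-0199"}]
patch(store, "/customers/1", change)
after_one_patch = copy.deepcopy(store)
patch(store, "/customers/1", change)
assert store != after_one_patch
```

This is why a client can safely retry a PUT after a timeout, but needs more care (or a conditional request) before retrying a PATCH.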

If you’re a server developer and want to enable your clients to update only parts of a resource – say, a customer’s address – you basically have three options:

  • POST the new address to the resource and have the server decide what to do – in this case, process the address change only – based on the content
  • Expose each part you want to be changeable individually as a resource in its own right and use PUT, i.e. make address a resource http://…/customer/:id/address that you can PUT to
  • Use PATCH to put information about the intended change to the resource itself, using an appropriate format understood by the server

Using POST is OK, but only because it essentially means nothing. The PUT option is perfectly fine, but requires you to explicitly create resources for this purpose. This is actually the best option in many cases, especially if the resources you create in the process turn out to be meaningful in their own right and support other methods (GET in particular). It often ends up feeling a bit contrived, though, so it’s nice to have the third option: Using PATCH means you are being very clear about the purpose of the request, and don’t need to create new and possibly otherwise unnecessary resources. It’s still fully RESTful because PATCH is an extremely generic method.
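
To make the three options a bit more tangible, here is a rough Python sketch of what the server side of each might look like. It is not tied to Rails or any other framework; the `customers` dictionary, the three handler functions, the returned status codes, and the shape of the PATCH body (a dictionary of top-level fields to replace) are all assumptions chosen for illustration.

```python
# In-memory stand-in for the server's customer resources (hypothetical data).
customers = {"42": {"name": "Ada Lovelace", "address": {"city": "London"}}}

def post_customer(customer_id, body):
    """Option 1: POST to the customer; the server decides what to do based on the content."""
    if "address" in body:
        customers[customer_id]["address"] = body["address"]
        return 200
    return 400  # this sketch only knows how to process address changes

def put_address(customer_id, address):
    """Option 2: PUT a full representation to the address resource (/customer/:id/address)."""
    customers[customer_id]["address"] = address
    return 200

def patch_customer(customer_id, changes):
    """Option 3: PATCH the customer itself with a description of the intended change."""
    customers[customer_id].update(changes)
    return 200

post_customer("42", {"address": {"city": "Cologne"}})
put_address("42", {"city": "Berlin"})
patch_customer("42", {"address": {"city": "Monheim"}})
```

All three calls end up changing only the address. What differs is how much the request itself tells a client, server, or intermediary: the POST says nothing about intent, the PUT needs an extra resource, and the PATCH states the partial update explicitly.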

Note that while using POST for partial updates is OK, using PUT (as Rails does) is not, because it violates the behavior as defined by the spec. So changing it is a very good idea, and the only two options are POST and PATCH.

Even though PATCH is clearly useful, and has the ultimate REST authority’s blessing, my recommendation in the past has been to avoid it because you can’t count on anyone (or anything) supporting it, and go with PUT or POST instead. With Rails’ influence, I see this changing – and I very much look forward to being able to include PATCH in my RESTful designs in the future. Add PATCH (in addition to PUT and DELETE) to HTML 5, and I’ll be more than happy …

Blog Update

I took the opportunity of a 10-hour flight to finally re-import all of my blog’s old entries into Octopress, fiddling around with a semi-automated process of Ruby and shell scripts and some creative Emacs macros (check out the archives if you care, even though I have no idea why you would). So if this worked, all of the old stuff should be preserved, even though most of it now only has historic value (and prevents Google searches from hitting a 404). I’ve also taken the risk of moving the whole stuff back to its old location, and will redirect the temporary one back once I’m sure everything works. One less excuse to not take up blogging again …

XPath and XML

Aristotle Pagaltzis has written a very nice and concise XPath intro. Of all the various standards in the XML ecosystem, I like XPath best, most of all because it enables interaction with an XML document or message in a way that matches Postel’s law: It’s a great way to implement code that doesn’t break each time a minor change is made to an XML format. In fact I’d say it’s one of the few very good reasons to stick with XML instead of adopting JSON – even though there are things like JSONPath, they don’t have the tool support and standardization you get from XPath.
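
As a small illustration of that robustness, the following Python sketch reads the same piece of information from two versions of a made-up order format. The element names, both sample documents, and the `customer_city` helper are hypothetical; note also that Python’s standard `xml.etree.ElementTree` module only implements a subset of XPath, which happens to be enough here.

```python
import xml.etree.ElementTree as ET

# First version of a hypothetical order format.
OLD_FORMAT = "<order><customer><name>Ada</name><city>London</city></customer></order>"

# A later version: new attributes, a new element, and the city moved one level down.
NEW_FORMAT = """<order version="2">
  <note>added later</note>
  <customer id="c1">
    <name>Ada</name>
    <contact><city>London</city></contact>
  </customer>
</order>"""

def customer_city(xml_text):
    """Find the customer's city wherever it sits below the customer element."""
    root = ET.fromstring(xml_text)
    node = root.find(".//customer//city")
    return node.text if node is not None else None

print(customer_city(OLD_FORMAT))
print(customer_city(NEW_FORMAT))
```

Because the expression only states what the code actually depends on (a city somewhere below a customer), the restructuring in the second version does not break the reader.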

Media Types in RESTful HTTP

A topic that comes up again and again in discussions about RESTful design is what kinds of media type to use. I’ve changed my mind about this a few times, often enough so that I believe it makes sense to write a bit about my current thinking on this. (A “media type”, should you happen to wonder what I’m talking about, is a specific format with a well-defined name, in this context represented in the form of a media type identifier such as application/xml or text/html or application/json. [That’s not 100% correct, but that doesn’t really matter here.] A “hypermedia format” is one that includes links or other hypermedia controls.)

There are a number of different ways to deal with media types when designing a RESTful HTTP system. One school of thought advises sticking with hypermedia formats/media types that are well-defined and widely understood, such as HTML or Atom. In other words: Whatever it is you’re trying to send around as part of an HTTP message, use an existing format, such as HTML, the main reason being that there are many processors that are able to understand it. Use the appropriate MIME identifier (such as text/html) in Content-Type headers. One can make a strong case for this option: Hypermedia formats are hard to design, so you should avoid inventing your own.

But let’s assume you’ve decided to define your own hypermedia format, Mike Amundsen-style, whether by designing a completely new XML vocabulary, your own JSON structure, or some other approach: What MIME type do you use?

You can send content labeling it with the generic identifier, say application/xml. In this case, the MIME identifier will signal the technical format being used, while the semantics are only known to clients who either have some out-of-band knowledge or interpret the content itself. The rationale for this approach is that unless your home-grown hypermedia format is likely to be widely adopted, you’d better stick with a media type that is well-known, even though it doesn’t convey specific semantics. Duncan Cragg wrote a nice post on this a while back.

The second option is to invent your own MIME type, say application/vnd.mycompany-myformat, the argument being that you need to convey the semantics of the content to ensure a client, server or intermediary can actually know whether it’s able to process it.

This raises the question of how many different MIME types you’ll end up with. Instead of creating a new identifier for each type of content (e.g. a customer, a list of customers, a list of orders), I’ve found that a good approach is to think of a specific context – a domain, if you prefer – that your format covers. I like the similarity of this approach to other hypermedia formats, e.g. HTML or Atom/AtomPub, where you actually end up describing something that applies to a set of collaborating participants, instead of some server-side interface. In my favorite example domain (order processing), you might end up with a MIME type of application/vnd.mycompany-ordermanagement, relate this to a particular XML namespace and define a few different XML root elements (order, order confirmation, cancellation, etc.). The assumption would be that processors can be reasonably expected to be able to understand all of the elements in this context, not just one of them. (Of course the same reasoning applies when using JSON or something else, minus the namespace benefits/troubles, depending on your view of XML.)
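
A processor for such a domain-scoped media type might look roughly like the Python sketch below. The namespace URI, the exact spelling of the root element names, and the `classify` function are assumptions made up for this example; only the media type identifier is taken from the text above.

```python
import xml.etree.ElementTree as ET

MEDIA_TYPE = "application/vnd.mycompany-ordermanagement"
NAMESPACE = "http://example.com/ordermanagement"  # hypothetical namespace URI

# Hypothetical root elements that all belong to the one order-management format.
KNOWN_ROOTS = {"order", "orderConfirmation", "cancellation"}

def classify(content_type, body):
    """Return the kind of document received, or raise ValueError if it can't be processed."""
    # Ignore media type parameters such as charset.
    if content_type.split(";")[0].strip().lower() != MEDIA_TYPE:
        raise ValueError("unsupported media type: " + content_type)
    root = ET.fromstring(body)
    prefix = "{" + NAMESPACE + "}"
    if not root.tag.startswith(prefix):
        raise ValueError("unexpected namespace: " + root.tag)
    kind = root.tag[len(prefix):]
    if kind not in KNOWN_ROOTS:
        raise ValueError("unknown root element: " + kind)
    return kind

doc = ('<cancellation xmlns="http://example.com/ordermanagement">'
       "<orderId>17</orderId></cancellation>")
print(classify(MEDIA_TYPE, doc))
```

The media type tells the processor which domain it is dealing with; the root element then tells it which of the domain’s documents it has received.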

One final approach that I find very interesting was mentioned by Jan Algermissen a while ago: If your format is based on an existing one, e.g. XML or JSON, your server can actually send the same content with different MIME types, depending on the client’s capabilities. A client that only included application/json in its Accept header would then get the content labeled application/json, while one that includes the specific MIME type application/vnd.whatever would get the same content with this label applied.
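
A minimal Python sketch of this labeling decision follows. The function names are made up for illustration, and the Accept header parsing is deliberately simplistic: it ignores q-values and other parameters, which a real server would have to honor.

```python
GENERIC = "application/json"
SPECIFIC = "application/vnd.whatever"

def accepted_types(accept_header):
    """Return the media types named in an Accept header, ignoring parameters such as q."""
    return [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
        if part.strip()
    ]

def label_for(accept_header):
    """Choose the Content-Type label for the (identical) response body."""
    types = accepted_types(accept_header)
    if SPECIFIC in types:
        return SPECIFIC  # the client has signalled that it knows our format
    if GENERIC in types or "application/*" in types or "*/*" in types:
        return GENERIC   # same bytes, generic label
    return None          # a server would answer 406 Not Acceptable here

print(label_for("application/json"))
print(label_for("application/vnd.whatever, application/json;q=0.5"))
```

The body sent back is the same in both cases; only the label differs, so generic JSON tooling keeps working while clients that know the specific format can tell they received it.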

A Fresh Start … Again

Google Reader’s extremely weird behavior after a first attempt to move my blog to a new location has made me try yet another different blog setup. This time, there should be nothing at all left from the old installation, no redirects, different URIs for everything, a new blog system – clearly, there’s no way for Google Reader to mess things up this time. Let that be the last thing said about this mess; even though it’s very sad to see all of those blog followers go who only occasionally drop by whenever something new appears in their Google Reader subscription, many of them may not have been following anymore, anyway. Who knows. It’s been a long time since I spent some serious time blogging.

Which doesn’t mean one can’t restart, right? We’ll see. For now, this is the one and only blog I’ll maintain. It has zero moving parts, i.e. it’s a static site, (currently) generated using Octopress. If everything works as expected, you’ll be able to subscribe to an Atom feed that should actually be working for a change.

I’m somewhat undecided with regard to comments and the Twitter integration (both in the sidebar and at the bottom of each post). I’ve toyed with the idea of doing away with comments altogether, but I have to say that Disqus seems quite smart.