Recently in SOA, Web Services and REST Category

REST Anti-Patterns


When people start trying out REST, they usually start looking around for examples - and not only find a lot of examples that claim to be “RESTful”, or are labeled as a “REST API”, but also dig up a lot of discussions about why a specific service that claims to do REST actually fails to do so.

Why does this happen? HTTP is nothing new, but it has been applied in a wide variety of ways. Some of them were in line with the ideas the Web’s designers had in mind, but many were not. Applying REST principles to your HTTP applications, whether you build them for human consumption, for use by another program, or both, means that you do the exact opposite: You try to use the Web “correctly”, or if you object to the idea that one is “right” and one is “wrong”: in a RESTful way. For many, this is indeed a very new approach. […] As with any new approach, it helps to be aware of some common patterns. In the first two articles of this series, I’ve tried to outline some basic ones - such as the concept of collection resources, the mapping of calculation results to resources in their own right, or the use of syndication to model events. A future article will expand on these and other patterns. For this one, though, I want to focus on anti-patterns - typical examples of attempted RESTful HTTP usage that create problems and show that someone has attempted, but failed, to adopt REST ideas.

Let’s start with a quick list of anti-patterns I’ve managed to come up with:

  1. Tunneling everything through GET
  2. Tunneling everything through POST
  3. Ignoring caching
  4. Ignoring response codes
  5. Misusing cookies
  6. Forgetting hypermedia
  7. Ignoring MIME types
  8. Breaking self-descriptiveness

Let’s go through each of them in detail.

More on InfoQ.

My esteemed InfoQ colleague JJ Dubray once again provides a great example of how to avoid ever learning anything and piss off people in the process.

In what has become a tradition when I'm in the mood, these are my unedited notes from Werner Vogels's keynote talk "Web-scale Computing: Compete on Ideas, not Resources" at IIR's (German) Web 2.0/SOA/EAM conference in Wiesbaden.

  • everybody in the audience except two guys are Amazon customers
  • when you put something in your shopping cart, you don't want to care about the technical details
  • now: take off your Amazon customer hat, think of Amazon as a technology provider
  • shows example - subscription model for toilet paper!
  • "buy box" (the blue area) shows the best product for the customer - even if it's not sold by Amazon.com
  • being a platform provider means you have to be absolutely neutral
  • many other examples of websites powered by Amazon.com
  • some statistical data - 80M customers, 1.3M active resellers ...
  • retail, ecommerce (associates), infrastructure ws, enterprise customers
  • shows Amazon.com from 1995 - key idea back then: do something on the Web that you couldn't do otherwise (have all of the world's books in stock)
  • history: app server/database (1995-2001) --> service orientation --> massively scalable services
  • for one year, Amazon ran a mainframe DB
  • in 2001, the Web servers hit a performance/scalability wall
  • 2001-2004: services
  • now, everything is massive scale
  • the secret sauce of Amazon.com: not its recommendations, but its capability to do anything at scale
  • 1st step: modularization - co-locate data and the logic depending on it, no direct DB access anymore
  • now: ~1000 different services
  • a page will hit 250-350 services - even single lines, such as "sales rank", call a service
  • large services at the bottom (customer, product, offer) serve as indices to additional services
  • each service has a small team associated with it, responsible for building and running it - no separate operations dept
  • no better motivation to fix a bug than your beeper going off at 4 in the morning
  • software as bits as opposed to software as a service
  • one bug/one fix approach
  • the whole saas thing is a big lie! there's a big elephant in the room nobody's talking about
  • reason: between test and operate, you need to handle all the non-functionals - load balancing, scaling, utilization, ...
  • vendors have no idea how to handle these things b/c traditionally, the customers did it
  • most of the engineers' time was spent configuring routers and managing load balancers - 70% of their time went to undifferentiated heavy lifting
  • example: picture of AT&T data center built near a trailer park - which of course was destroyed by a tornado
  • 365 Main in downtown SF runs 8 generators in their data center - three months ago 6 of 8 generators failed despite being tested --> most of Web 2.0 offline
  • Google study: 10% of disks will fail per year - with 80,000 disks in a typical data center, that means 8,000 disks fail per year -> you'll have people employed who only change disks
  • graph of target.com and walmart.com -> holiday peaks 2-3 times the rest of the year's average
  • lessons learned - offer 1,000 Wiis, 100,000 people will show up
  • pitch for 37signals' "Getting Real"
  • Amazon.com web services: s3, sqs, ec2, simpledb, fps
  • was used internally for 2 years before it was offered externally

Scalability

  • growth by good customer experience -> traffic -> sellers -> selection -> lower prices -> customer experience
  • incremental scalability is key
  • being able to grow systems one step at a time
  • infrastructure needs to move from capital investment to variable cost
  • elastic: capable of growing and shrinking on demand
  • minimal disruption to customer performance
  • addresses: different growth paths, fault-tolerance, heterogeneity, operational efficiency
  • you can't assume your infrastructure is homogeneous

Availability

  • everything fails, all the time
  • somebody cuts a cable in the Suez canal - the rest of the world thinks India is gone
  • failures are highly correlated
  • things fail in groups
  • things don't fail by stopping - instead, systems fail by sending out large amounts of garbage
  • a load balancer sending to a machine returning very fast responses -- all 500s
  • let go of control - take a probabilistic approach: determinism doesn't exist in real life

Performance

  • engineering for performance for 99.9%
  • averages are irrelevant
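To illustrate why averages say so little, here is a minimal sketch in Python. The latency numbers are made up for illustration, and the nearest-rank `percentile` helper is my own assumption, not anything shown in the talk.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of all samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds: 995 fast requests and 5 very slow ones.
latencies_ms = [20] * 995 + [4000] * 5

mean_ms = sum(latencies_ms) / len(latencies_ms)
median_ms = percentile(latencies_ms, 50)
p999_ms = percentile(latencies_ms, 99.9)

print(f"mean={mean_ms} ms, median={median_ms} ms, p99.9={p999_ms} ms")
```

The mean looks harmless, while the 99.9th percentile shows what the unluckiest customers actually experience.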

Cost effectiveness

  • uncertainty
  • acquire resources on demand - you can't predict anything
  • release resources when no longer needed
  • the new economy is all about much intensified competition
  • don't rely on resources
  • the power of your success is now no longer in your hand

  • these four non-functional properties of large systems are dominated by state management

  • categorization of data access patterns

    • primary key access (high read volume, always writable)
    • query-based access (relationless + relational)
  • two large services: S3 and SimpleDB
  • EC2 with persistent storage for dedicated purposes
  • billions of objects in Amazon S3
  • the traffic out of Amazon's web services is larger than the traffic of all retail properties combined
  • availability zones
  • explanation of persistent storage for EC2
  • the big deal is: any type of legacy system can be run within the cloud
  • the only thing needed to get started: a credit card and http://aws.amazon.com (no contract, negotiations, ...)
  • (Question by yours truly: does Amazon.com use the services internally?)
  • Yes, extensively, given it's […]. If S3 ever failed, you'd notice it in Amazon.com
  • (Question: is Amazon impacted by the peak loads it has to handle?)
  • Amazon.com's scale is basically dwarfed by the platform it offers to others; it profits just as much.

Great talk, too bad it was this short.

Simon Harris:

The HTTP methods should be used to indicate the user’s intention without regard to the underlying implementation. The web application is an abstraction so we need to model the interaction on that abstraction. If the user’s intention is to make a change to something then go ahead and use a PUT but if they’re only reading some data use a GET even if you know it involves some database writes.

It may seem somewhat esoteric but spending a bit of time thinking about what the user’s intention is exactly has helped me better flesh out an application’s API.

I have the same experience: The CRUD analogy may help when you start “getting” REST, but you should leave it behind as soon as you can.
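A minimal sketch of what this could look like, in Python. The `ArticleResource` class and its `view_count` field are hypothetical names I made up for illustration: the GET performs a write internally, but the user only intended to read, and the representation never changes as a result.

```python
class ArticleResource:
    """Hypothetical resource: the method reflects the user's intention, not the implementation."""

    def __init__(self, body):
        self.body = body
        self.view_count = 0  # internal bookkeeping, not part of the representation

    def get(self):
        # The user only wants to read. The counter update is an implementation
        # detail (a "database write") that the client never asked for.
        self.view_count += 1
        return 200, self.body

    def put(self, new_body):
        # The user intends to change the resource, so this is a PUT.
        self.body = new_body
        return 200, self.body


article = ArticleResource("REST Anti-Patterns")
first = article.get()
second = article.get()
print(first == second, article.view_count)
```

From the client's point of view, repeating the GET is harmless, which is exactly what lets caches and crawlers treat it as safe.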

This may seem like a good idea at first sight -- a service that turns

http://servicereg.com/verb/{user:password@}domain/noun{/id?params}

into a corresponding GET, PUT, POST or DELETE against a RESTful Rails app.

Unfortunately, it has two major problems:

  • It defeats the whole purpose of RESTful design, which is to use the HTTP verbs according to their meaning
  • It basically assumes "RESTful service" equals the way Rails does them, which is one, but definitely not the only way
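To make the first problem visible, here is a rough sketch in Python of what such a translation has to do. It is my own guess at the mechanics, not the service's actual code; the `translate` function, the verb names and the example URL are all assumptions.

```python
from urllib.parse import urlsplit

# Assumed mapping from the verb path segment to the HTTP method sent to the real app.
VERB_TO_METHOD = {"get": "GET", "put": "PUT", "post": "POST", "delete": "DELETE"}

def translate(tunneled_url):
    """Turn /verb/domain/noun/id into (HTTP method, target URL)."""
    parts = urlsplit(tunneled_url)
    verb, domain, *rest = parts.path.lstrip("/").split("/")
    method = VERB_TO_METHOD[verb.lower()]
    target = f"http://{domain}/" + "/".join(rest)
    if parts.query:
        target += "?" + parts.query
    return method, target

result = translate("http://servicereg.com/delete/example.org/orders/42")
print(result)
```

The client only ever issues a plain GET against the tunneled URL, so every cache, crawler or prefetcher along the way sees a safe request that in fact ends up as a DELETE.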

In a new interview, recorded at QCon San Francisco, noted Web services expert and open source developer Dan Diephouse talks about the benefits of using the Atom Pub and Atom standards for business applications, pros and cons of using REST, and upcoming features of the Apache CXF web services stack.

More on InfoQ.

HTML, GET and POST


Bill de hÓra:

The enormity of the consequences of HTML only allowing GET and POST cannot be overstated IMO.  It's maybe the most damaging technical decision in the web standards space - ever.  I see HTML forms as a root cause for the WS-* "everything goes over POST" debacle, a billion dollar industry mistake, at best.

I've always wondered what the history behind the GET/POST-only restriction in HTML is. Was there a good reason (or did something appear to be a good reason) for doing so? I can't think of one.

Mark Nottingham:

While there’s a nice internal logic to mapping HTTP methods to object methods, it doesn’t realise the power of having generic semantics.

While I agree there's a lot to be improved in existing HTTP APIs, I'm not sure this is the most pressing problem. I'd rather somebody bring up some ideas on how to exploit the hypermedia aspects ...

Blogging in German for a change:

The vision sounds so wonderfully consistent and simple: we conduct a clean business process analysis, top-down, using BPMN and of course carried out by business experts who need no IT knowledge whatsoever. The result is a contradiction-free model of the company's actual processes, described so formally that we can generate BPEL from it via a simple export. We throw this at an engine for execution, and the engine orchestrates our services, which already exist or are quickly implemented from scratch, so that everything now runs automatically. This is of course complete nonsense, and for several reasons at once.

More in COMPUTERWOCHE.de’s new aggregate blog on SOA and BPM.

These are my unedited notes from Jim Webber's talk at QCon London 2008. I might have to stop early as I have only some battery left ... we'll see.

  • Invented "Enterprise Manboobs" yesterday (defined as the big bloated ESB middleware -- it's so fat it has manboobs)
  • Schizophrenic on whether or not to prefer messaging or Web
  • MESTian with sympathies for RESTafarians
  • Web services, happily abusing HTTP
  • Business processes, hosted within services, communicating via messages
  • The XML fairy sprinkles pixy dust (which may in fact be crack cocaine) on your enterprise systems
  • In XML pixy dust land, messages are at the center
  • Except: Web services are evil -- because the messaging vision is as far out of reach
  • WSDL is the biggest pile of dog shit around
  • "WSDL is plainly shit. That's a technical term."
  • WS-HumanTask? WTF?
  • WSDL 2? Or WSDL too late?
  • WS-HumanTask? I value my humanity, I don't want to be wrapped in WSDL.
  • Not everything needs to be an OASIS standard. We know not to take a leak in public. (He said this.)
  • WCF is the best of an extraordinary set of bad toolkits
  • Toolkits hide messages - WCF at least creates hope before forcing you back into the RPC mindset
  • Quick pitch for SSDL and Soya
  • Web Services could rock his world
  • WSDL has to die.
  • Sad that Sanjiva is not in the audience.
  • Things could be nice (SOAP processing model, message, loose coupling by default, composable model)
  • A lot of the WS-* stack can be ignored most of the time
  • "Tunneling is all a bunch of tree-hugging hippy crap"
  • URI tunneling and POX both treat HTTP as a transport
  • Some Web jihadists don't see this -- pointing explicitly at the Rails community (?)
  • Web Services tunnel SOAP over HTTP
  • Lots of Web people doing the same
  • Worse than SOAP
  • Example of methods with parameters exposed via URIs
  • very easy to understand, easy to code, pretty interoperable
  • BUT it's brittle RPC, tight coupling, no metadata
  • You can use GET and don't change any state
  • POX: uses XML in the HTTP request and response to move a call stack between client and server
  • Example: POSTing your credit card details into some porn site
  • Simplicity, interoperable, can use complex data structures
  • BUT: Tight coupling, no metadata support (unless you use a toolkit that supports WSDL w/ the HTTP binding)
  • RPC is commonplace today
  • To err is human, to really mess things up you need a computer
  • To really, really mess things up you need a distributed system
  • Bad Web services and Web integration have much in common
  • claimed end of rant (we'll see)
  • Web fundamentals
  • Bored lonely physicist Tim never intended the Web to be a middleware platform
  • Serendipity is great - don't let the RESTafarians tell you different
  • Roy must have thought: I can combine my love of porn surfing with my PhD
  • The Web exposes many of the characteristics we want - recoverable, loosely coupled, available, ...
  • Tenets for Web-based services (for the last time today)
  • Resource-based, Addressability, Statelessness, Representations, Links, Uniform Interface
  • Resource architecture - physical resources, logical resources, uniform interface, resource representation, consumer (web client)
  • Web URIs should be readable, not opaque
  • Use URI templates to make your resource structure easy to understand (S3 example)
  • Non-conformance and proud (google later)
  • Example: Good thing the Library of Congress didn't open up its contents with a link "print this" on every page
  • PUT is idempotent. That means you can do the same thing multiple times.
  • Points out all of the methods, status codes, headers
  • We have a comprehensive model for distributed computing ... but we still need a way of programming it
  • The Value of the Web is "linked-ness"
  • The same is true for the programmatic Web
  • Shows examples of links
  • The Web is just like Petri nets - links are state transitions
  • Microformats are an example of little "s" semantics
  • Innovation at the edges of the Web - not by some central design authority such as the W3C
  • Started by embedding machine-processable elements in Web pages -- e.g. calendar information, ...
  • With microformats, use the rel attribute to describe the semantics of the referred resource
  • "Subjunctive Programming"
  • With changing contracts as part of a resource, we can't be too imperative anymore
  • Doesn't have the cojones to call the Web declarative
  • Subjunctive preferred (what if I get a link to pay ...)
  • How would a typical enterprise workflow look if it's implemented in a Web-friendly way?
  • Starbucks example -- it's a bit of a lie, as there is no happy path through Starbucks
  • w/ Web services (MEST style), a conversation with a service via exchanging messages with it
  • The longest interaction you can have supported by WSDL is request/response
  • It's like a conversation with your demented granny who calls you "Billy"
  • Advertise it with SSDL or BPEL
  • What if this were modeled as resources?
  • Model workflow stages as resources, state transitions as hyperlinks/URI templates
  • First, do a POST to /order, Starbucks returns 201 Created + a URI
  • In the returned representation, there's a link rel='pay'
  • If I make a mistake, I ask for my OPTIONS
  • If it says I'm allowed a PUT, I can update it (PUT)
  • Send an Expect: 100-continue (would it be OK?)
  • 200 OK, yes you can
  • So do the PUT
  • If it fails, do it again (it's idempotent)
  • as the resource doesn't remember interactions, use If-Unmodified-Since and get back a 412 Precondition failed if something is wrong
  • If I do the PUT too late, I could get 409 Conflict
  • A new OPTIONS call says it's readonly now
  • Following the payment link: POST to the order payment resource
  • New payment created, its URI is returned
  • How do I know to POST? Via OPTIONS
  • Now if I get the order again, no payment link any more
  • Starbucks can have some resources that are private, but can also access the same resources
  • There could be an Atom feed of orders
  • HTTP auth can be used to ensure that only Starbucks can access specific resources (401 Unauthorized)
  • Lessons: HTTP has a header/status combination for everything
  • APIs are expressed in terms of links, and they're cool
  • AtomPub is a blueprint to develop similar protocols
  • (Although the Atom guys will tell you Atom is all you'll ever need)
  • APIs can also be constructed with URI templates and inference
  • XML is fine, but other options like APP, JSON or maybe XHTML as a middle ground
  • Summary: both the Web and Web services community suffer from piss-poor patterns and practices and awful tooling
  • Both platforms are about externalizing state machines when done well
  • WS-* is bloated, but most of it can and should be ignored
  • The Web is now starting to feel the love from middleware vendors - beware!
  • Sends shivers up his spine - not convinced it needs the Java API love, no equivalent to JAX-WS on the Web
  • MEST and REST are both sensible approaches
  • Question: In the Starbucks example, how realistic is it because of the limits of most Web servers?
  • Answer: Maybe not today, but a year or so from now ...
  • Question: Link with the payment - would you define some kind of schema for that?
  • Answer: Should the IETF have the only set of definitions? No, he would like to be able to link to a description
  • Question: Degree of freedom may end up creating problems
  • Answer: Things can appear to be close semantically, but different in their implementation -- it would be nice if these things were consistent, but if they're not, so what
  • Some of the people who advocate the Web claim the Web is simple -- they're stupid
  • It's not easier -- creating a good system requires thought
  • Question: Should the contract be defined formally?
  • Answer: WADL (doesn't like it, similar to WSDL), just accept there are links, or create something new (currently happening)
  • Question: Doesn't the client have to be too intelligent?
  • Answer: True, this is tricky because the client has to be prepared for many things. Having to be declarative, we know we have to be ready for everything. There's additional work ...
  • Udi: There's intelligence involved -- there are programmers building stuff
  • If it were this intelligent, it would be like SkyNet (probably some W3C working group somewhere)
  • Caching in the Web means caching cheaply and getting it right
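
The ordering workflow from the notes above can be sketched as a small in-memory simulation in Python. This is only my reading of the notes, not code from the talk: the `OrderService` class, the URIs and the method names are made up, there is no real HTTP involved, and I've swapped the If-Unmodified-Since header for a simple version counter to keep things short.

```python
import itertools

class OrderService:
    """In-memory stand-in for the coffee-ordering service (hypothetical API)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.orders = {}

    def post_order(self, drink):
        # POST to /order: creates a new order resource, returns 201 Created + its URI.
        uri = f"/order/{next(self._ids)}"
        self.orders[uri] = {"drink": drink, "version": 1, "paid": False}
        return 201, uri

    def get(self, uri):
        # The representation carries a rel='pay' link only while payment is still due.
        order = self.orders[uri]
        links = {} if order["paid"] else {"pay": f"/payment{uri}"}
        return 200, {"drink": order["drink"], "version": order["version"], "links": links}

    def options(self, uri):
        # Tells the client which methods are currently allowed on the order.
        return 200, ["GET"] if self.orders[uri]["paid"] else ["GET", "PUT"]

    def put(self, uri, drink, if_version):
        # Returns only a status code. The version check stands in for If-Unmodified-Since.
        order = self.orders[uri]
        if order["paid"]:
            return 409  # Conflict: too late to change the order
        if if_version != order["version"]:
            return 412  # Precondition Failed: the client's copy is stale
        order["drink"] = drink
        order["version"] += 1
        return 200

    def post_payment(self, payment_uri, amount):
        # POST to the payment link: creates the payment and returns its URI.
        # The amount is accepted but not validated in this sketch.
        order_uri = payment_uri[len("/payment"):]
        self.orders[order_uri]["paid"] = True
        return 201, payment_uri


service = OrderService()
created, order_uri = service.post_order("latte")
_, order = service.get(order_uri)
_, allowed_before = service.options(order_uri)
updated = service.put(order_uri, "latte, extra shot", if_version=order["version"])
stale = service.put(order_uri, "mocha", if_version=order["version"])
paid, payment_uri = service.post_payment(order["links"]["pay"], 4.0)
too_late = service.put(order_uri, "tea", if_version=2)
_, allowed_after = service.options(order_uri)
_, final_order = service.get(order_uri)
print(created, updated, stale, paid, too_late, allowed_after, final_order["links"])
```

The client never hard-codes the payment URI: it follows the `pay` link while it exists and asks OPTIONS what it may do next, which is the "links are state transitions" idea in miniature.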