March 18, 2004

Responding to Dave Orchard's Take on MDA

Dave Orchard had some thoughts about MDA, and I commented on his blog entry very briefly. I'll try to address each of his claims in more detail below. I have elaborated many of the points in an older posting; it might be a good idea to read it if you're looking for some initial discussion.

Dave says:

The first lesson to learn for MDA is that "abstractions", like a logical model, typically worsen performance and there is a need for selectively optimizing performance.

First of all, I think a logical model is something that you just can't avoid having: if you design any sort of application, you'll have a conceptual model of your entities, their relationships, your processes, and so on, regardless of whether you actually draw this as a UML model in some fancy CASE tool, sketch it on a napkin, or just keep it all in your mind. You will think of the Customer having n Contracts, and each Contract referring to n Articles, or something like that; it's highly unlikely that you will think of your application's concepts in terms of the underlying technology.

The point Dave is trying to make is, if I guess correctly, that having a uniform way for mapping this logical model to a specific implementation strategy worsens performance. That's true, and it is a common trade-off: Uniformity will make your system easier to understand because things look more or less the same; it will not offer the same performance as the manually tuned solution that can take all of the specifics into account.

While I agree in principle, I believe that in every (complex) application, you will have some guidelines that describe the default "mapping". In most J2EE applications, for example, you will decide on one general strategy - such as "We will use EJB CMP" or "we'll use Hibernate" - and stick with it, except for the places where it's justified to deviate from your own standards.

As you will always have a standard way of doing things, an MDA approach will automate this, saving you time; it will localize the concept in a single place, so that it's easy to understand; it will increase quality, since you'll be doing it the same way automatically; and it will allow you to evolve your strategy should you find out that it can be improved.

Dave continues:

We had a bigger problem though: The developers hated it. Somebody would create the logical model, they'd push the "generate" code button, and then run the software. But guess what, they always got the model wrong. Maybe they forgot about the zip code in the address, or the middle name in the name structure. So the model needed to be updated.

So far, I see no problem — you of course don't get everything right the first time, so you'll have to update and change your model. I fail to see anything particularly tied to an MDA approach so far. The actual problem seems to be described in the next two sentences:

To make a simple change in the model and then generate took way too long. It could take up to half an hour before the system could be retested. The devs simply would make the change they needed in the place they needed it. For example, the SQL already had the zip code so they only needed to add the zip code in the Java and in the SQL select.

If it takes half an hour to do this, something is wrong with your tool set. There are generators that can generate hundreds of files in under a minute; depending on the CASE tool, the XMI export itself may sometimes take longer than the actual code generation step.

Even more importantly, in my experience the example given ("they only needed to add the zip code in the Java and in the SQL select") is an argument that clearly shows where the strengths, not the weaknesses, of MDA lie: In a real-world architecture, you'll have to add the zip code in the SQL DDL and in the query string (as well as in the update and create statements); you'll have to add it in the place where you extract the data; you'll have to add it to some Java object and create getters and setters; you'll have to extend some interface (e.g. in one of your EJBs); and you might have to add it to some FormBean, modify the code that copies it from some back-end object to the front-end object, and so on.

Saving time by doing this manually? I don't think so.
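To make the zip code argument concrete, here is a minimal sketch of the idea, not of any particular MDA tool. The `Attribute` and `Entity` classes, the three `generate_*` functions, and the `Address` example are all hypothetical names made up for illustration; a real generator would work from a UML/XMI model and emit far more artifacts. The point is only that the new attribute is added once, in the model, and every derived artifact picks it up.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str
    sql_type: str
    java_type: str


@dataclass
class Entity:
    name: str
    attributes: list


def generate_ddl(entity):
    """Emit a CREATE TABLE statement with one column per model attribute."""
    columns = ",\n".join(f"    {a.name} {a.sql_type}" for a in entity.attributes)
    return f"CREATE TABLE {entity.name.lower()} (\n{columns}\n);"


def generate_select(entity):
    """Emit the SELECT statement that reads every attribute."""
    columns = ", ".join(a.name for a in entity.attributes)
    return f"SELECT {columns} FROM {entity.name.lower()} WHERE id = ?"


def generate_java_bean(entity):
    """Emit a Java class with a field, getter and setter per attribute."""
    lines = [f"public class {entity.name} {{"]
    for a in entity.attributes:
        lines.append(f"    private {a.java_type} {a.name};")
    for a in entity.attributes:
        cap = a.name[0].upper() + a.name[1:]
        lines.append(f"    public {a.java_type} get{cap}() {{ return {a.name}; }}")
        lines.append(
            f"    public void set{cap}({a.java_type} value) {{ this.{a.name} = value; }}"
        )
    lines.append("}")
    return "\n".join(lines)


# The logical model (hypothetical example entity).
address = Entity("Address", [
    Attribute("id", "INTEGER", "int"),
    Attribute("street", "VARCHAR(100)", "String"),
    Attribute("city", "VARCHAR(50)", "String"),
])

# The forgotten zip code: a single change in the model ...
address.attributes.append(Attribute("zipCode", "VARCHAR(10)", "String"))

# ... and all derived artifacts are regenerated consistently.
ddl = generate_ddl(address)
select = generate_select(address)
bean = generate_java_bean(address)
print(ddl, select, bean, sep="\n\n")
```

In a hand-maintained code base, each of these three outputs (and the update statements, interfaces and form beans that the sketch leaves out) would be a separate manual edit.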

Next quote:

I assert that MDA systems almost invariably suffer from the "design documents collect dust on bookshelves" problem, despite best attempts of the tools and organization to stop the natural entropy

I assert that anytime you use modeling, UML or not, and you don't use an MDA approach, you will suffer from this problem (since the model is just part of the initial design documentation). With MDA, you make your model one of the development artifacts; it can't get out of sync (that would be like part of your source code getting out of sync). Of course you have to enforce this, but this is no different from disallowing people from modifying the byte code emitted by javac.

The technology simply introduces an artificial layer of abstraction that is too difficult to modify and build high performance systems.

By the same logic, I could say that EJB CMP introduces a layer that makes it impossible to build high performance systems. Or Java. Or maybe even an RDBMS. After all, not being tied to the relational abstraction will allow me to build a faster system - right?

Stripping down the rest of Dave's posting, I believe the key statements are

A final problem with MDA systems is that they don't really solve the hard problem. [...] A final problem with MDA systems is that they don't really solve the hard problem.

I totally agree; anybody who claims that MDA solves all problems, and turns it into a silver bullet, is clearly wrong. I believe that software development is hard, and it's hard enough even with an MDA approach. Without one, it becomes unnecessarily harder, and I see no reason to accept that.

Dave's summary is:

In conclusion, I believe that MDA systems solve a small portion of systems development and will typically suffer from the "stale design" and performance problems. I think the path forward for software development is perhaps to use MDA as a prototyping exercise, but the real productivity gains will come from ever increasing productivity tools (like better GUIs, APIs and programming models) and increasing metadata.

As I hope to have made clear, I strongly disagree with the first part of this statement. Using MDA for prototyping only clearly misses the real benefits — and in my view, MDA is only about increasing productivity, nothing else.

I'm not a fan of statistics, and I don't particularly believe in specific case studies being generalized to an overall productivity gain. But I know from experience that especially in scenarios where you have a big code base, a strong set of architectural guidelines, and an application that is more data-centric than process-centric, the productivity gains are significant. Especially with EJB/CMP-based backends, the percentage of generated code (for this layer) can be up to 100%.

March 11, 2004

DSM vs. MDA

Michael Platt has a well-written explanation about the differences between MDA and the DSM approach favored by Microsoft. I agree with some of the criticisms, while I still hate to think that we'll be left with a new Microsoft-vs.-Rest-of-the-World battle.

March 10, 2004

Whitehorse - Microsoft's Modeling Tool

Microsoft's Keith Short has some information, including screenshots, about Microsoft's new modeling tool, Whitehorse (some more information here, here and here). Very interesting, although the comments about keeping code and model in sync remind me of other flawed efforts.

March 09, 2004

Web Services: Avoiding APIs

I've spent some time thinking about the pros and cons of using APIs vs. using standardized wire formats, prompted by a whole bunch of references to this issue - by Don Box, Dave Orchard, and Sean McGrath. It's becoming more and more clear to me what the advantages of standardizing on the wire formats are, so I thought I might as well share some of my ideas. Feedback, as usual, is very welcome.

So what's the issue? I believe that when you take your first look at Web services, and have a strong background in Distributed Objects, or other, older RPC-based technologies, you get a strong feeling of déjà vu. After all, isn't this all just CORBA or DCOM reinvented, with a different wire protocol? What could be the advantage of using a fat, text-based format instead of a nice, binary, efficient protocol like DCE RPC, IIOP, or plain RMI? Why would anyone in their right mind even care about what the bits you send over the wire look like?

You might be tempted to support Web services standards, most notably SOAP, just as another, additional transport. Create a common API, and use it to send messages via RMI, CORBA, or JMS - without having to change anything in your application code. Have custom transports, and plug them in as needed. Isn't this the best possible strategy?

Unsurprisingly, I think the answer is: no - on the contrary, it's the worst thing you could possibly do.

First of all, there is an architectural difference between the way applications should be designed for tightly coupled vs. loosely coupled interactions. This is mostly related to the granularity of your services, as opposed to the granularity of your components' or objects' interfaces. While I believe this argument to be strong, it's not really related to the technology being used - you can just as easily build a loosely coupled system based on, say, JMS. In fact, I have seen customers do exactly that - independent components on a common bus, with the ability to add and remove components that listen and send to specific topics or queues, allowing for great flexibility in system evolution.
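As a rough illustration of the "independent components on a common bus" style, here is a minimal in-process sketch. It is not JMS and uses no messaging library; `MessageBus`, the `orders` topic and the two handlers are hypothetical stand-ins. What it shows is that the publisher only knows the topic, so listeners can be added and removed without touching it.

```python
from collections import defaultdict


class MessageBus:
    """Tiny in-process stand-in for a topic-based message bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def unsubscribe(self, topic, handler):
        self._subscribers[topic].remove(handler)

    def publish(self, topic, message):
        # Iterate over a copy so handlers may (un)subscribe while being called.
        for handler in list(self._subscribers[topic]):
            handler(message)


bus = MessageBus()
billing_log = []
shipping_log = []


def billing(message):
    billing_log.append(message["order_id"])


def shipping(message):
    shipping_log.append(message["order_id"])


bus.subscribe("orders", billing)
bus.publish("orders", {"order_id": 1})

# A new component joins later; the publishing code does not change.
bus.subscribe("orders", shipping)
bus.publish("orders", {"order_id": 2})

# An existing component is removed; again the publisher is unaffected.
bus.unsubscribe("orders", billing)
bus.publish("orders", {"order_id": 3})

print(billing_log, shipping_log)
```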

But there is another aspect, and a downside that seems to be less clear: Standardizing on the API doesn't buy you very much in terms of interoperability.

Let's say you standardize an API used to access services (or objects or whatever) that reside somewhere in your infrastructure. The effect is that you have a high degree of interoperability between the partners that use the same API. You can change the API implementation, and everyone using that API will be interoperable again - in the cases where you use dynamic linking, this will happen instantaneously.

But what about others? What about third party products, or products developed in different departments of a large organization? Do they use the same API? Are they even developed in the same programming language, and if not, do you have an implementation of the API for that programming language as well? How sure can you be that even if you have one, it's in sync with the other implementations?

With a protocol like IIOP (which is what CORBA and J2EE use for transport), there is simply no way you could standardize the message on the wire format level. There's no easy way to describe it, and the only way to make sure everybody can interoperate is that you use the same CORBA version and 100% compatible implementations. Of course, CORBA interop has become a lot better in the last few years. But problems always occur when the underlying format needs to be changed - as is the case e.g. for transactions (with the need to propagate a transaction context) or security.

The beauty of SOAP - wow, that alone should have somebody flaming me - is that it actually makes it very easy to standardize on the format level. XML in general, and SOAP headers in particular, are very easily extensible. Basing your interop standards on a certain kind of SOAP message, including standards-based and non-standards-based headers, yields interoperability on the wire level. This, in turn, enables a C++ app to talk to a Java or C# one, and if there's anything e.g. in the SOAP message's header that is specific to a certain type of application or interaction, implementations that don't understand this header can simply ignore it. With the level of support from a standardization perspective (after all, other people are likely to experience the same problems that any given organization does, so it's likely there'll be a common standard at some point in time), and with more and more applications that provide SOAP interfaces out of the box, integrating applications becomes an order of magnitude easier.
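The header argument can be sketched in a few lines. The message below is a SOAP 1.1 envelope; the `urn:example:*` namespaces, the `TransactionContext` and `Trace` header blocks and the `read_message` function are invented for illustration and do not correspond to any real standard or toolkit. The receiver picks out the header blocks it knows and skips the rest, relying on nothing but the wire format.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical message: one header this receiver understands, one it does not.
MESSAGE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <tx:TransactionContext xmlns:tx="urn:example:tx">tx-42</tx:TransactionContext>
    <audit:Trace xmlns:audit="urn:example:audit">trace-7</audit:Trace>
  </soap:Header>
  <soap:Body>
    <ord:GetOrder xmlns:ord="urn:example:orders"><ord:id>1001</ord:id></ord:GetOrder>
  </soap:Body>
</soap:Envelope>"""


def read_message(xml_text, known_headers):
    """Split header blocks into understood and ignored; return the payload too."""
    envelope = ET.fromstring(xml_text)
    header = envelope.find(f"{{{SOAP_NS}}}Header")
    body = envelope.find(f"{{{SOAP_NS}}}Body")

    understood = {}
    ignored = []
    for block in (list(header) if header is not None else []):
        if block.tag in known_headers:
            understood[block.tag] = block.text
        else:
            # Unknown header block: skip it and carry on with the message.
            ignored.append(block.tag)

    payload = body[0]  # first child element of the Body
    return understood, ignored, payload


# ElementTree reports qualified names as "{namespace}localname".
TX = "{urn:example:tx}TransactionContext"
understood, ignored, payload = read_message(MESSAGE, {TX})
print(understood, ignored, payload.tag)
```

A sender written in any other language only has to produce the same XML; there is no shared API or library between the two sides.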

February 08, 2004

Platform Independent Models, Platform Dependent Models, and Code Generation

I got into a very interesting discussion a few days ago: Where exactly can you find a PIM and a PSM, both in the MDA vision as such and in existing tools such as iQgen? My view on this issue is:

In MDA:

In "Pragmatic MDA" or "MDA light" or "MDA as it exists today" ...

So is there a PSM in the second approach? If so, where?

In my opinion, the PSM, if you insist on finding it, is the code itself. How can the code be a model? A model, in general, is a simulation of some concept or aspect from a specific point of view, intended to enable reasoning about it for a particular purpose. If I take the resulting code in the second approach from above, I can look at it in an IDE, the purpose being to edit it in its textual form. If I read the same code into a 1:1 CASE tool such as Together, I see a visual rendering, the purpose of which might be to manipulate it visually or explain it to somebody else.