Friday, March 23, 2007

Five Nines in a Service Oriented World - Part 2(a)

I'm going to break this down into a series of posts on the topic as it just gets a bit big in one post. The first post will cover the basic principles and options, then the subsequent posts will detail those options.

So in the first article on five nines I talked about the challenge of SOA reliability and how the goal of reaching five nines really gets difficult when you are talking about distributed applications. I take on board one of the comments about getting the "right" information being the goal of an operating SOA and it's that point that needs to be borne in mind. The goal of operation is to deliver a system that returns results which are acceptable to its consumers at that point in time given the operational constraints that exist.

So first off let's start with the different options for solving the problems raised by Deutsch's Fallacies. Sure, some people will say "well duh!" but it's worth stressing that there are different options for solving the basic problem of distributed reliability and those different options are applicable based on the differing business and technical drivers on your service. One size doesn't fit all. If you want a magic bullet, go and talk to a vendor; they'll be happy to sell you a nickel plated one.

Distributed high availability is hard; the goal here is to help you understand what type of "hard" problem you are facing.
  1. Plan for failure
  2. Understand what the "minimum" operating requirement is
  3. Understand the time criticality of information
  4. Understand the accuracy requirement of information

Taking these one at a time, they have some big impacts on how you design, build and support SOA environments. The primary challenge though is to accept a basic truth: perfect operation is impossible. If you strive for perfection then you will never deliver five nines; the goal of High Availability SOA is to deliver acceptable operation at all times.


Thursday, March 22, 2007

SOA Vendor Ratings - Q1 2007

It was back in May last year that I did my first assessment of the "main" SOA vendors, so first up I'm going to revisit that list, then I'm going to add in (and I really will this time) assessments from some of the smaller players and open source. As ever the following is my view, nothing to do with who I work for and can't be assumed to come close to reality etc, etc, etc.

This assessment has the same summary info as the previous one which means

1. IT Vision - What are they going to do in IT, implementation of applications, integration with backends, sort of the technical end of SOA
2. IT Implementation - Great powerpoints guys, but what about the products...
3. Business Vision - What are they doing for the business, what is the content and how will it work for the business
4. Business Implementation - as before, what exists beyond the powerpoints
5. Standards - SOA implementation is massively about standards, how much does this company implement and drive standards
6. Stability - How stable is the current product set and roadmap, will they be shifting strategy and leaving you in the lurch, or going out of business and doing the same

This time however I'm going to go a bit deeper on the actual vendor reviews. So first off here is the summary

Now you will notice that everyone has shifted quite a bit on the business vision side. This is partly because they have, but also because I've broadened out the business side to include the operational challenges of managing SOA at the business level. Oh and you'll also notice there are two IBM assessments... which probably makes that a good place to start.

N.B. The blue line is my assessment of "now", the red line is my prediction for where they will be in 3 years' time.

So why are there two IBM assessments? Well the first one is based around the roadmap that IBM tell everyone, the one that still includes MQSI, sorry "Advanced ESB". The second one is based on what I think is the real roadmap, and this comes from bitter experience of watching clients with MQ Workflow and WebSphere Interchange Server believe that they would continue as well. I don't buy the Advanced ESB line, I don't buy the "you've got to use a proprietary product that is a bugger to install" line over a single standards based platform, and I really don't buy the "There are things that MQSI^H^H^H^HAdvanced ESB does that just can't be done in J2EE". Hence there are two assessments: one based on IBM's published roadmap and one based on my view of their "real" roadmap.
Let's be clear here, J2EE is the way forwards for IBM. Having something that has a completely different development, deployment, management and versioning approach makes no sense, and what is left that is important that can't be done in Process Server et al today? A bit of multi-protocol support and the ability to do COBOL Copybook? The shame is that IBM do have in their J2EE based stack a really good set of products for developing applications. They are still pretty weak at the business and pan-enterprise level but they have added the registry and of course have one of the broader tooling suites out there. Oddly however this tooling support doesn't appear to extend to testing, where the async testing support appears to be limited to JUnit, which isn't exactly great as JUnit is poor at async (as I know from testing an MQSI infrastructure using JUnit). With CBM they actually have a business modelling approach, but unfortunately that still looks like it's considered "special" so isn't yet in the tool suite for everyone to use. Good suite, good for applications, good vision (where it isn't subverted) but they are a bit weak across the enterprise, and they really need to start being honest around their roadmap so people can start planning for the Java based solution that is bound to come.
Well BEA have bought a few more companies and the Aqualogic and Weblogic brands are really beginning to take shape. They still don't have anything in terms of methodology at the business service level and this really is going to be an issue in the coming years. They've started talking about the situational applications (Aqualogic area) and this split of backend handling and business focus really does make a lot of sense.
They really need to beef up around the governance and testing side though. It really isn't good enough to have a "preferred" partner; it's either "use whatever you want" or it's "in the box". Testing especially is an issue: they don't have any async testing, which isn't great for projects and future viability. The current messaging around Tuxedo as a very expensive Adaptor for mainframes is also a bit odd, hence the knock down there. Great product suite, great stack, good split of business and technology, but they need to focus more on the operational side of SOA in the same way as they have previously done around the application server.
You really have to give the folks at Oracle credit, this time last year they had no ESB (except if you believe some of the analyst reports) and to be honest I thought it was going to take them a long time to get something that is properly separated. Sure there is the continued huge focus on "BPEL" as the answer to world hunger but there is certainly something coming together. This year is a big year for them as it's the release of version 11 of the stack, and with their membership of both the JBI and SCA/SDO camps it's going to be very interesting to see the quality of what comes out in that new version.
Weak on the "business side", particularly around the modelling piece (and a great big EA tool is not the answer IMO). The operational side of the tool is okay, but where they really shine is around the testing: they actually have some async testing that can be linked back to a continual build, so see, it is possible. Integration is okay but a bit basic right now and the designer elements of the tool aren't really up to snuff from an SOA perspective. A good stack, an amazing rate of acceleration, but it's fair to say that there are still plenty of areas for improvement for the 11 AS release.
The gap between Oracle and SAP continues to widen in terms of the independent viability of the middleware stack. They've had some good thinking around the futures of all of this and the visioning is strong; the question is whether they can ever separate the packaged application futures from the demands of the middleware. It's a similar problem to the one that Microsoft have, but at least with SAP they are binding it to actual business value and business information.
Basically if you are doing SAP then it's worth doing, and indeed it's probably the only way, but if it's a choice as a broad technology stack across the enterprise then this probably isn't the one you are looking for.
Will Sun deliver on the vision that was put forward last year, or will an EAI centric view of SOA emerge? Therein lies the basic dilemma for Sun at the moment. They have a good EAI centric product in JCAPS (the old SeeBeyond stuff) and a great set of future tools (as demoed by Charles Beckham for me at JavaOne last year); the challenge now is to make that tooling shift while keeping the solidity of the underlying platform. At least with SeeBeyond and JCAPS it's all based around J2EE so they don't have the mess that some others have.
The thing that knocks Sun down from an application development and operational perspective is that the current tools are very "me" centric, by which I mean that they assume that everything runs on JCAPS. The other knockdown is the debacle that is JavaSE 6, which really doesn't help the perception of Sun as a company that wants to solve enterprise problems. I really hope Sun bring it all together and start focusing up at the business problems where they currently aren't really involved at all. Great integration stack, really good for doing interfaces onto systems, needs to broaden out (using the tools that they actually have) into being an application stack and from there on towards the business.
Microsoft's progression around SOA since last May? Well they've released an operating system which has a proprietary async process model in it and they have a decent client side development model for web services.... Linking technology so directly to an operating system release is just plain bonkers, it's as dumb as putting a Web Service stack into the JavaSE 6 release.

BizTalk remains the "heart" of much of the SOA messaging but it's essentially the same product as 2004, which isn't great. Everyone else has moved on and it will be interesting to see if Microsoft come up with something equivalent to SCA, or even adopt it now it's going into OASIS. With the Longhorn release due this year it's really time for them to step up the focus around the enterprise and particularly improve their lifecycle and design support tooling. Microsoft Motion is a good business focused way of creating views on an enterprise, but unfortunately it appears too often to have been subverted into a "buy product" pitch. Either Microsoft want to play in the enterprise software space or they've decided that it's not worth the effort; this year should outline which of those it is.
The ratings and categories explained
Now a quick summary on what the ratings actually mean. First off, this is an assessment against what "perfect" would be today, rather than all time perfect (i.e. if someone stays at the state of the "now" then they'll always be a 5). The numbers are as follows:
  1. Very very basic, not really functional
  2. Basic, meets some powerpoint and demo needs, but not much else, might be via a 3rd party to make it actually work.
  3. Can be used by the skilled
  4. Actually a pleasure to use and helps you move forwards
  5. Cooking with Gas
So really it's an exponential scale rather than linear.

Now for the categories
  • BSA - Business Service Architecture, the ability to model the enterprise as services
  • BSB - Business Service Bus
  • BPM - Proper SOA and business centric Process Management
  • Registry - A service registry
  • Management - Ability to manage and configure operational services
  • Monitoring - SLA and monitoring of services and interactions, independent of the vendor
  • Testing - Testing of services at all stages of the lifecycle, especially async testing
  • App Design - Ability to develop applications that consist of multiple services
  • App Dev - Ability to develop services and applications that co-ordinate them
  • App Process - Application level process models (where BPEL sits) and its ability to work in a proper SOA way
  • App Model - The overall conceptual model of SOA applications that the vendor pushes
  • ISB - The integration service bus, getting things out of older systems
  • Adaptors - How easy is it to get things out of old systems
  • Int Model - Integration Model, the conceptual model that the vendor pushes for integration
  • Standards - How well does the vendor implement and support standards
  • SCA/SDO - How well is the vendor progressing down the SCA/SDO path
  • JBI - How well is the vendor progressing down the JBI path
  • WS-* - How good is the vendor at supporting WS-* (WS-TX excluded)
  • J2EE - How well do they support J2EE (standardised operating environment = lower support costs, no matter how much people bleat)
  • Roadmap Honesty - How well (IMO) does the published roadmap reflect what will really happen
As ever comments welcomed, particularly in this case in terms of what I should assess next.

Update: To be clear, this is about the technology vendors. For those looking to start SOA this is the secondary thing; the most important is knowing what the actual services should be.


Five Nines in a Service Oriented World - Part 1 the problem

SOA systems are more prone to failure than traditional IT systems, and SOA systems which rely on "the Web", HTTP or other network centric approaches are more prone to failure than those which rely on a conceptual model and then implement locally. I'd say it's staggering how Peter Deutsch's 7 fallacies of network computing are being ignored in this latest technology driven approach (and I'm talking here about SOD IT rather than Business SOA) but unfortunately it's not staggering, it's exactly what is to be expected from an industry that loves to re-invent the wheel and where every 3-5 years it looks like yet another bunch of teenagers who "know best" are pushing the new "cool" approach. The teenagers are of course the vendors and the technology fan-boys.

But isn't the "Web" really reliable? Isn't SOA going to make systems more reliable? Errr no it isn't, not if we carry on designing systems in the same bad old ways and just thinking that designing for distribution is the same as designing for a single box. The Internet (as opposed to the Web) is reliable, but that is because it was designed to be; the Web wasn't.

The question raised at my meeting this week was "how many nines would a service need to have to support five nines at the system level?" Now the mathematically challenged out there (and I've seen them) will do the following...
We need an average of five nines... but because of the "weakest" link principle its probably safest to have everything at five nines
And before anyone says "no-one would be that stupid", I've seen this sort of thinking on many occasions. So what is the real answer? Well let's assume a simple system of 6 services

To make this simple, the central service A is the only one that relies on any other services, and A's code and hardware are "perfect"; therefore the only unreliability in A comes from its calls to the other 5 services. The question therefore is how reliable must they be for A to achieve five nines? There is a single capability that has to be enacted on each service and that capability results in a change to the system's state (not just reads). The reliability of A is the combined probability of success of all of the interactions, i.e. the product of their individual reliabilities.

So if we assume that all interactions are equally reliable (makes the maths easier) then the reliability of A is the reliability of a single interaction raised to the power of the number of interactions.
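
That relationship can be written out as follows. This is a reconstruction consistent with the scenario figures below, assuming interactions fail independently; the symbols R_A, r_i, r and n are labels chosen here, not the original notation.

```latex
R_A = \prod_{i=1}^{n} r_i
\qquad\text{and, with every } r_i = r,\qquad
R_A = r^{n}
\quad\Longrightarrow\quad
r = R_A^{1/n}
```

Here R_A is the reliability of A, r_i the reliability of interaction i, and n the number of interactions, so the requirement on each interaction gets tighter as n grows.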

Note that here we aren't including performance as a measure but it should be pretty clear that performance of a networked request is slower than a local request and that things such as reliability over a network have a performance cost.

Scenario 1 - WS without RM
In this scenario we have standard WS-I without Reliable messaging so we have two points of unreliability, the network and the service itself. This means there are ten interactions. Feeding that into the formula this means that each interaction (and therefore each service) needs to support six nines of reliability in order to deliver five nines for A.

Scenario 2 - WS with RM
Now using RM (or SOAP over JMS) we can partly eliminate network failure from our overall reliability, so now we are thinking about five interactions, which gives us a 99.9998% availability requirement on the services. Still pretty stunning (and a great example of diminishing returns).

Scenario 3 - REST no GET
In this scenario using REST it's assumed that the URIs for the resources are already available and no searching is required to get to the resources; this means that returns from each service contain the link to the resource on the other service. This is exactly the same as the standard WS scenario so again it's six nines required.

Scenario 4 - REST with GET
In this approach we are doing a GET first to check what the current valid actions are on the resource (dynamic interface). This means that we have twice the number of calls over the network, which means twice the number of interactions. This gives a pretty stunning reliability requirement of 99.99995% for the services.
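
The four scenario figures can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming independent and equally reliable interactions; the function names and the SCENARIOS table are made up for illustration, and the interaction counts are the ones quoted in the scenarios above.

```python
import math


def required_reliability(target: float, interactions: int) -> float:
    """Per-interaction reliability r such that r ** interactions == target."""
    return target ** (1.0 / interactions)


def nines(reliability: float) -> float:
    """Express a reliability as a count of nines (0.999 is roughly 3)."""
    return -math.log10(1.0 - reliability)


TARGET = 0.99999  # five nines for service A

# Interaction counts as quoted in each scenario (hypothetical table name).
SCENARIOS = {
    "1 - WS without RM": 10,
    "2 - WS with RM": 5,
    "3 - REST no GET": 10,
    "4 - REST with GET": 20,
}

for name, n in SCENARIOS.items():
    r = required_reliability(TARGET, n)
    print(f"Scenario {name}: {n} interactions -> "
          f"{r * 100:.6f}% ({nines(r):.2f} nines)")
```

The only input that changes between scenarios is the number of interactions, which is why the design choice of how chatty a service is shows up so directly in the reliability requirement.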

This of course is for a trivial example where we are talking about a single service which calls 5 services which don't call any other services. If we take some of the more "chatty" suggestions that are out there for the "right" way to implement services then we start on down a road where the numbers just get... well plain silly. If we assume that each of the 5 services itself calls 5 services each time it's called then the impact goes up in Scenario 4 (6 x 20 interactions) to needing over seven nines of reliability in each of the services. To put that into perspective, that is under 3 seconds of downtime a year. That just isn't sensible however, so the real question with SOA is:
How do I change my thinking to enable services to fail but systems to succeed?
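
For a sense of scale on the nested case above, the downtime figure can be checked the same way. Again this is a rough sketch: it treats reliability as availability over a 365 day year, assumes independent and equally reliable interactions, and the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(reliability: float) -> float:
    """Seconds per year a service may be unavailable at this reliability."""
    return (1.0 - reliability) * SECONDS_PER_YEAR


TARGET = 0.99999           # five nines for service A
INTERACTIONS = 6 * 20      # the nested "chatty" case quoted above

# Per-interaction reliability needed so that all 120 together give five nines.
per_service = TARGET ** (1.0 / INTERACTIONS)

print(f"required per-interaction availability: {per_service * 100:.7f}%")
print(f"allowed downtime per service: "
      f"{downtime_seconds_per_year(per_service):.2f} seconds per year")
print(f"five nines itself allows: "
      f"{downtime_seconds_per_year(TARGET) / 60:.1f} minutes per year")
```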


Tuesday, March 20, 2007

Two challenges for SOA and Web 2.0

I was presenting to one of our important clients at work this week, doing my "what is SOA" pitch around Business Service architecture with another set of technical vendor assessments (due out soon), and the CTO from the client raised a few key points which I think are worth giving some air:
  1. SOA and Web 2.0 aren't "simpler", in fact it's a whole lot more complex; each generation of IT has created more complexity, and SOA + Web 2.0 is liable to be worse
  2. Reliability is a nightmare in a distributed service based world
To be honest it's great to speak to someone who isn't drinking the vendor Kool-Aid, but it really does highlight the problems that are currently being glossed over. I've talked before about the challenge of async that everyone seems to ignore but the challenge is actually bigger than just async. It's all about the increased challenges around versioning when you have more things to version, and it's about the increased challenge of worrying about networks and external services and still delivering a decent service.

I'll deal with point 2 in my next post, but the key here is that to make SOA deliver business agility it means that IT will have to deal with a more complex IT problem. That means there has to be a business value for the solution or you are better off putting everything in a mainframe.


Wednesday, March 14, 2007

Church Turing reduction - the vendor way

I'm sitting at QCon today getting some work done and I've overheard a bunch of vendors brilliantly apply the Church Turing Thesis to their products. The bit they use is the "any solvable problem can be reduced to a previously solved problem" and the conversation goes a bit like this

Vendor: So what are you looking at?

Customer: We are currently trying to solve the grand unified theory of everything/build a web-site/integrate our ERPs
Vendor: Well of course the most complex part of that challenge is the clustering/framework/distribution/deployment/management and its that bit which everything else relies on to work.
Customer: Well isn't that just a technology piece of the puzzle?
Vendor: Oh no, it's what everything relies on, it's basically the central part of everything.
Customer: Why is it so important?
Vendor: Well without clustering/framework/flying monkeys/deployment/management then nothing else will work and you will have complete chaos
Customer: But my last project worked and I didn't have your product
Vendor: But did you have problems?
Customer: Of course there are always problems
Vendor: This would remove all of those problems
Customer: But the problems had nothing to do with what your product does
Vendor: Well how you see the problems isn't where our product works, but not having our product is what manifested itself in the issues you saw
Customer: Errr sure.... give me a leaflet then.... errr I don't have any business cards on me at the moment....

It's amazing how so many things are the "central" part of IT, hell the centre of IT appears to be so big it's where everything is, like a great big black hole that sucks in all the light. Sure I know they are just trying to make a living, but do they really think that claiming to be the cause of the big bang is really going to help their sale?


If I can't test my app, you don't have a product

One thing that is really beginning to irritate me with the pace of "progress" in Service Oriented Development technologies is how much is made of the new bell or whistle, and how little time is dedicated to making that facility actually operational. So we see a new tool that can only be deployed with an IDE, or a process engine with no async testing tool or a deployment process that has no audit trail.

What this leads to is great demos, bad projects and woeful operation. There really is no excuse these days to not have the basics done; this means:

1) Scripting of deployment, ideally with a supplied ant task
2) Test generation "answer", ideally as part of the suite, again this must be executable outside the IDE (e.g. from ant)
3) Security on deployment and logging

Having a "partnership" with someone who does these bits is fine, as long as it's bundled and I don't have to pay more money for it. Having a "user" testing product isn't acceptable as that is UAT (the most expensive form of testing) and I'm after unit and system test.

If I can't professionally build, test and deploy my application on your product, then your product isn't professional.


Sunday, March 11, 2007

Why trusting the consumer is dumb...

I've seen some pretty dumb things in my time but today, crossing into San Francisco over the Bay Bridge, I saw what ranks right up there with the dumbest things I've ever seen. It's also an object lesson in why expecting consumers to behave in the way you want and not do something different (whether good or bad) is a strategy that is doomed to fail. Any strategy that expects consumers to "remember" information correctly and not prat about with it is going to fail, any strategy that assumes that consumers will always do only what you want is going to fail.

Watch this video and remember... this is an object lesson in why your service (or resource) won't be used in the way you expect.

I decided not to embed the vid into the post as that seemed a bit antisocial. So "One car, one seat, one phone, one woman and her dog" is best watched over at YouTube. Apologies for the swearing but it was a very big truck.
