Biology of Distributed Information Systems: 2009

Saturday, September 19, 2009

New Shared Document

For reasons explained in my other blog, I am keeping quiet for a while. However, I have added the "Shared Documents" gadget on this blog and added a new presentation about SOA and BPM.

I gave this invited talk at the SOA & BPM IDC Conference on September 17th.

As usual, it is offered under a Creative Commons license.

Saturday, July 11, 2009

Kolmogorov and the measure of competitive value

I was fortunate enough to attend USI and, although I could not attend all the sessions, it was quite fruitful. The (summer) "University of the Information System" is organized by Octo, BCG, le Monde Informatique and TV4IT. It is a great gathering for "bosses and geeks", with lots of opportunities for networking (and meeting old friends as far as I am concerned) and learning exciting stuff (the list of keynotes is amazing).

1. Complexity

I'll start with the key idea that attending a brainstorming session moderated by Luc de Brabandere generated. It may be stated in a pompous manner as:

CompetitiveValue(IS) = f(complexity)
The competitive value from Information Systems is a function of their complexity (in the Kolmogorov sense)

It starts as follows: what is not complex is easily reproduced and becomes a commodity, something that anyone can use and that therefore cannot be seen as a competitive advantage. For instance, the chisel in the hand of the stone carver is such a commodity tool. Although it is crucial to the task, and is taken great care of, the chisel is not a differentiating factor. Anyone can get a great chisel. What makes a great stone statue is the talent and craft in the hands of the stone carver. For those companies for which IT is a differentiating factor, there is a fair amount of complexity that has been mastered, from a size, a technological or a business integration perspective.

This is actually very close to the concept of information measure as defined by Kolmogorov. Let's recall that Kolmogorov measures the complexity of an information sequence as the size of the smallest program that can generate the sequence. Anything that is very rich but generated from a small set of rules has a low Kolmogorov complexity, while chaotic and random structures have a high Kolmogorov complexity. Here the measure of an information system is precisely what cannot be reduced to a set of rules and a few enabling technologies. If you have a large information system which is using standard tools and standard techniques in a usual manner, its complexity measure will be small. If you have an information system that is uniquely tuned to the business, where the practical know-how built over years has helped to resolve technical difficulties, its complexity measure is high.
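Kolmogorov complexity itself cannot be computed, but the length of a compressed encoding gives a rough upper bound on it. The sketch below uses that proxy (an assumption of this illustration, not something from the session) to contrast a sequence produced by one simple rule with a random sequence of the same length. The names `compressed_size`, `rule_based` and `random_like` are made up for the example.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data.

    This is only a crude upper-bound proxy for Kolmogorov complexity,
    which is not computable in general.
    """
    return len(zlib.compress(data, 9))


# A long sequence generated by one simple rule: repeat a short pattern.
rule_based = b"standard-tool;" * 1000

# A sequence of the same length with no generating rule behind it.
random_like = os.urandom(len(rule_based))

print("rule-based :", len(rule_based), "->", compressed_size(rule_based))
print("random-like:", len(random_like), "->", compressed_size(random_like))
```

The rule-based sequence shrinks to a tiny fraction of its length, while the random one does not shrink at all: large does not mean complex.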

This approach is strikingly compatible with Nicholas Carr's position on "IT does not matter", that is, IT without complexity is a commodity. One must read the original article or the book to see that Carr is talking precisely about the competitive value of information systems. His vision, which is fairly optimistic in its timing but generally accepted as target architecture, is that Web Service mash-ups will transform IT into a commodity. This Web Service / Cloud IT will not be without value (as necessary as electricity) but without differentiating value. I disagree about the availability of this “commodity IT” (cf. my book “Information Technology for the Chief Executive”, whose second chapter talks about N. Carr’s position), but I definitely agree with the (obvious) statement that there is no differentiating value with a commodity service.

One could say that IT without complexity does not exist, but that is not true. Software as a service, for instance, is clearly one way to remove a fair amount of complexity. Really simple IT exists; unfortunately it cannot solve all problems. At the end of the day, there remains a lot of complexity, irrespective of the technology or the procurement options that are chosen (cf. the Web site of the SITA: Sustainable IT Architecture). This is why companies need a CIO in the first place :). For me, the first job of the CIO is to manage complexity. This includes:

Reducing complexity through an Enterprise Architecture approach;
Removing complexity whenever possible, that is, empowering users to manage their own information system;
Taming complexity through collaboration and training.

The paradox is that the CIO’s mission is to constantly reduce the perimeter of her/his job. But since the mission of any enterprise is to be smarter than its competitors, new challenges keep being thrown at the CIO …

Cloud computing is about complexity “de-materialization”: the complexity does not vanish (cf. SOA is not scale free), only its nature changes. If the IS is managing a complex set of business processes with a lot of service interactions, rapid changes and performance constraints, moving the IT to the cloud does not make the complexity disappear.

Back to this idea of Enterprise Architecture, the challenge is to produce a flexible architecture, which may sound like an oxymoron. More precisely, one must create structure without rigidity. This prompts a few suggestions:

Be wary of “invariants” - there are few of them. So-called invariants are traps to install rigidity (for French readers, this is a topic covered in my first book)
Reference designs are living objects. Architecture relies on a number of reference designs: data models, service catalog, integration framework, etc. Any good Enterprise Architecture methodology will tell you how to build extensible designs. It looks like design violation is closely associated with innovation (?)
Diversity is key (a theme from the second day of the USI, I’ll be back in a moment).

This line of thought brings us back to biology and the general theme of this blog. In the living world, “invariants” (building blocks) are small and they are versatile (better than flexible). As explained by Albert Jacquard during his magnificent talk, diversity comes from reproduction, hence from randomness.

2. Uncertainty

The brainstorming session which generated all these ideas was about uncertainty. How to live, how to create, how to be relevant in an uncertain world? Luc de Brabandere used a different set of scenarios, which are somewhat similar to the four scenarios of Dan Rasmus in his book “Listening to the Future”.

I am a big believer in this approach to defining a proper strategy for information systems (cf. previous reference to the second chapter of my book). A scenario is not a forecast; it is a virtual situation designed to foster creativity. This is a key point: the scenario’s value is not to be as close as possible to what will happen in the future; it is to help build skills that will prove useful in the future (for the information systems or for the employees). In a word, the goal of the “scenario exercise” is to develop one's situation potential.

I have developed a “theory” over the years (cf. my other, French blog) that the only tool to master uncertainty is gaming (as in "serious gaming"). Games are based on virtual scenarios but may develop true skills or help to better understand the possibilities of the future. I will return to this idea in a future post. What came to me as a conclusion of this afternoon session is that “Participants must participate”: passive viewing is of (almost) no value. This is deeper than it looks: since the scenario is not interesting per se, the value lies in the thought experiment and the collaboration that occurs between the participants while they play with the scenario. If a summary is proposed to a set of external listeners, most of the time it sounds dull or strange. I came to express this as a communication rule: if you need to report the result of a scenario-brainstorming session to your managers (or some other managers), it must keep the form of a role-playing exercise where the audience is actively engaged.

3. Value

The first section of this post dealt with “differentiation value”. What about “regular” value? The classical issue of the value that is produced by information systems was, as one would expect, central to this year’s USI, with one dedicated session on this topic. A good reference on this topic, by the way, is Ahmed Bounfour’s book “Organizational Capital”.

That session started with an outlook on the issue, stating that there is neither consensus nor any method that would be applicable to the whole spectrum of issues (I agree, I wrote as much in the previously mentioned book :)). Octo’s proposed approach is to define a “usage value”, very similar to Adam Smith’s definition: the value of a component of the information system is the additional amount of time it would take to perform the task without this component. It is expressed as a monetary value and discounted over a given amount of time, such as the life expectancy of the software component.

It is a convenient measure, because it is easy to understand and relatively easy to evaluate, at least when an order of magnitude is concerned. Obviously the value must be capped by the total amount of money generated by the associated business process, in order to cover activities that would not exist without an IT platform (e.g., something that would require a million hands and generates little value). It has a few nice properties: it takes the quality of service into account, as well as the true deployment of the component. A beautiful application that is almost never used has a null value with this approach, a desirable property which is not true of all methods!
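As a rough illustration of this “usage value”, here is a minimal sketch. Everything in it is an assumption made for the example: the function name `usage_value`, its parameters, the yearly discounting, and the choice to apply the cap year by year. Octo's actual method may differ on every one of these points.

```python
def usage_value(hours_saved_per_year: float,
                hourly_cost: float,
                lifetime_years: int,
                discount_rate: float,
                process_revenue_per_year: float) -> float:
    """Discounted monetary value of the time a component saves.

    All parameters are hypothetical illustrations:
    - hours_saved_per_year: extra time the task would take without the component
    - hourly_cost: monetary cost of one hour of work
    - lifetime_years: life expectancy of the software component
    - discount_rate: yearly discount rate (0.05 means 5%)
    - process_revenue_per_year: money generated by the business process (the cap)
    """
    total = 0.0
    for year in range(1, lifetime_years + 1):
        # Value of the time saved this year, capped by what the process earns.
        yearly = min(hours_saved_per_year * hourly_cost, process_revenue_per_year)
        # Discount the yearly value back to today.
        total += yearly / (1.0 + discount_rate) ** year
    return total


# A component that is deployed and used, versus one that nobody uses.
print(usage_value(1000, 50.0, 5, 0.05, 1_000_000.0))
print(usage_value(0, 50.0, 5, 0.05, 1_000_000.0))
```

The second call returns zero: an application that is almost never used saves no time, whatever it cost to build.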

It is also a shortsighted measure, proving once again that it is hard to reconcile all objectives (cf. the introductory point). This measure does not take into account the future and how ready the information system is to embrace change. One could argue that “usage value” could be made future-oriented with a scenario approach, following the tracks of the previous section … and that’s true but that’s hard.

Adam Smith’s definition was quoted one day earlier by Daniel Cohen during a great evening speech. He talked about Philippe Askenazy's work on Solow’s paradox (the absence of evidence of productivity gains due to IT). Philippe Askenazy, through a careful study of a large sample of data points, was able to show that IT can bring value only in conjunction with re-organization:

Value = Information System + Re-engineering

Those companies that decided to reorganize themselves as they introduced IT into their processes showed significant returns after a few years, while those that embraced IT but did not change had nothing to show but costs. This is a great piece of evidence since it supports a claim that is easy to understand for any CIO: the IT revolution only works if used as a lever to re-organize and re-optimize work (hence the importance of business processes).

I will conclude with a great idea from Pierre Pezziardi and Laurent Avignon: foster innovation through opening new territories, places where anyone can contribute to the information system. Obviously one cannot allow anyone to touch anything anywhere, hence the concept of “new territories”. This would be a “zone” where almost anyone can make a contribution (a piece of software) that may be used by others. There is a lot of value there:

Foster creativity and innovation
Improve the image of the information system, from a dark mystery to a friendly tool :)
Support collaboration and engage a dialog with users
Find all the talents that hide in the company (i.e., not in the IT department) and who could contribute
Introduce diversity: the peaceful coexistence of safe, structured islands with active (even chaotic) agile platforms

For instance, one could open a few web services that give access to the heart of the information system (in a read mode for a first experiment :)) and pick a sandbox (such as Excel, Microsoft, Salesforce.com, Google App Engine) that is exposed with a SaaS (Software as a Service) philosophy. The ease of deployment (and adoption) is crucial here to make this a meaningful event. This should turn into a big innovation contest!

The general theme of Pierre and Laurent’s show “L’informatique conviviale” (it was nicely executed on stage in a very lively manner):

Diversity is necessary, Pleasure is key.

The first point is well developed in OCTO’s book “Une politique du Système d’Information”. Their dialectic analysis is too rich to be reported here, but may be summarized as follows: there are too many conflicting constraints on information systems to adopt a unique set of policies. Different parts of the information system must be governed with different approaches, to reconcile the needs for innovation, agility, safety, performance, reliability, etc. This “zoning” of the information system requires clear “passage rules” to support the exchange flows between the different zones.

The second point was the heart of their talk: build an information system that brings pleasure to the users and pleasure to the developers. This is a great insight since biologists tell us that there is no learning (continuous improvement) without pleasure. I will conclude with something that I learned a year ago at a conference about complex systems, on the process of learning for all living organisms, from the smallest to the most complex (us).

I learned ten years ago about Deming's PDCA cycle of continuous improvement: Plan, Do, Check, Act. There is a similar cycle that nature has invented for learning: Desire, Plan, Execute and Please. Desire creates the will to plan, to formulate a goal (for conscious living beings), to get ready. The execution yields pleasure, which strengthens the desire and reinforces the cycle. I believe that pleasure is an integral component of corporate/collective learning and continuous improvement. Pleasure can take many forms, from pride (in Japan) to simple fun (in a Silicon Valley start-up).

Sunday, June 7, 2009

Sustainable IT Budget in an Equation

This post is going to be technical: if you are allergic to math formulas, I suggest that you skip it or download my talk (in French) about IT value :)

The topic is the search for a sustainable IT budget, a thought that occurred to me during the SITA conference on April 30th. I figured that I could get some characteristic equation of a sustainable state as a fixed point of the IT budget equations. I have spent quite some time (over 10 years) modelling IT costs. Some of it may be found in my last book; it is also a key part of my Polytechnique course. This course is focused on Enterprise Architecture, following the analysis of the CEISAR, but is also heavily oriented towards economic analysis (cf. the previously published bibliography).

Before I dive into the technical analysis, I'd like to stress that I am not looking for a stable information system. Information systems are living objects, with new parts being constantly added and (hopefully) other parts being removed. What I am looking for is homeostasis, that is, a property that is verified by the IS considered as an open system, even though the parts are changing constantly. Here, the property that I consider is that the budget grows at the same rate as the turnover of the company.

What I did was to take the cost model that is published in my book and simplify it to reduce the number of necessary variables. The biggest simplification comes from using acquisition costs as opposed to function points to measure the application portfolio. I won't go into details today, but it actually makes sense.

The budget is defined as the sum of the project budget P and the operation budget O. Projects are used either to add new applications or to improve/maintain/renew/update existing ones. The two ratios that are of interest (based on what one hears at IT conferences or in benchmarking) are:

R = P / (P + O) = the % of the IT budget spent on projects (we also use r = P/O)
n = Pn/P = % of the project budget spent on creating new applications

The model that I use is straightforward:

each year, Pn € is spent acquiring/building new applications
a given percentage (d) of the application portfolio is removed/destroyed
(P - Pn) is spent renewing a fraction of the remaining apps, so that the average life expectancy of apps is A (measured in years; equivalently, the renewal rate is 1/A)
We suppose that the operation cost for an application that costs 1k€ is w k€ per year. A typical value for w is 25% (once again the references may be found here).
We also suppose that there is a productivity gain over the years of p%. That is, until the application is renewed (or discarded) its operation cost decreases by p% each year.

The IT budget is considered stable if both P and O grow at the same rate g as the turnover (such that r is also a constant). Looking for a stable state yields three equations:

stability of the application portfolio S (dS/S = g)
we get: w* = (g + d) / (n r) (weighted operation cost, smaller than w because of p)
renewal rate of the apps (P - Pn = S/A)
we get: n = 1 - 1/(w* A r)
stability of the operation expenses O (dO/O = g)
we get a huge formula:
dO/O = [(1-p)(1-d)(1-1/A) - 1] + (w/w*)(1-d)/A + w n r

The math, although high-school grade, is a little tricky. I implemented an Excel version of this model, in order to get a first-hand opinion about the behaviour of this simple model and also to check that my equation algebra was correct!

The good news is that I managed to get it right eventually, that is, the equations can be solved to express r and n as a function of the other parameters. Here are my two theorems (I will write a paper with the proofs one day):

n = [A (g + d) ] / (1 + (g+d) A)
r = [g +d +p +(1-d)/A] / [w (1 - d + A(g+d))]

The second formula is approximated (I left aside a few "second-order" subterms) but is still quite accurate when compared with the real value from the simulation. Note that if you suppose that g = 0 (i.e., a stable IT budget for a stable company), the equations become even simpler.
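The first formula can be checked with a few lines of code. The sketch below is only a sanity check under stated assumptions: the function names are made up, r is taken to be the project-to-operations ratio, and the sample values use g = 0. It computes n from the closed form, computes the product r·w* by adding the portfolio and renewal equations (which gives g + d + 1/A), and verifies that both steady-state equations hold. It does not touch the approximated formula for r.

```python
def new_app_share(g: float, d: float, A: float) -> float:
    """Closed form for n, the share of the project budget spent on new apps."""
    return A * (g + d) / (1.0 + A * (g + d))


def projects_times_weighted_ops(g: float, d: float, A: float) -> float:
    """Product r * w*, obtained by adding the two steady-state equations:

    portfolio stability:  n * r * w*       = g + d
    renewal rate:         (1 - n) * r * w* = 1 / A
    """
    return g + d + 1.0 / A


# Sample values (assumption: a stable company, g = 0).
g, d, A = 0.0, 0.05, 10.0
n = new_app_share(g, d, A)
rw = projects_times_weighted_ops(g, d, A)

# Both steady-state equations should be satisfied by the closed form.
assert abs(n * rw - (g + d)) < 1e-12
assert abs((1.0 - n) * rw - 1.0 / A) < 1e-12

print(f"n = {n:.1%}")
```

With these inputs the share of the project budget going to new applications comes out at one third.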

What can be derived from these formulas?

First, they actually provide useful values. If w = 25%, d = 5%, p = 4%, A = 10, we get R = 42% and n = 33%. For someone who has been a CIO for many years, these values make sense. It is important to realize that if n is, say, 50%, then the IS is unstable and will grow!
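The instability claim can be made concrete by inverting the model. In the sketch below (the function name is hypothetical, and it assumes that renewal spending exactly maintains the average life expectancy A), the renewal equation P - Pn = S/A combined with Pn = nP gives Pn = nS / (A(1 - n)), so the portfolio grows each year at the rate Pn/S - d.

```python
def implied_portfolio_growth(n: float, d: float, A: float) -> float:
    """Yearly growth rate of the application portfolio S implied by n.

    Derived from P - Pn = S/A and Pn = n * P, which give
    Pn / S = n / (A * (1 - n)); a fraction d of the portfolio is removed.
    """
    return n / (A * (1.0 - n)) - d


d, A = 0.05, 10.0
for n in (1.0 / 3.0, 0.5):
    growth = implied_portfolio_growth(n, d, A)
    print(f"n = {n:.0%} -> portfolio growth = {growth:.1%} per year")
```

With n at one third the portfolio is stable; with n at 50% it grows by 5% every year, and the operation costs follow.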

Second, they show something that I have claimed for a long time: you cannot compare companies using R as a benchmarking ratio if you do not know A! I have sat through so many hours of bullshit SOA infrastructure presentations that explain how to increase R without taking the proper parameters into account (this is actually why I decided to write my first book: I was fed up listening to consultants who had no clue what they were talking about). I heard the same mistake made by experts (at the VP level of large US software companies). Hence I am glad to have such a compact counter-argument, something I have been looking for over ten years.

Last, the r formula gives a good summary of what can be done to increase the amount of money that can be spent on projects:

clean-up old applications
increase productivity for operations

Nothing new here, but it is nice to see that age-old sound advice (often ignored) may actually be proven.

Saturday, May 2, 2009

SOA : A Tale of Two Cities

This post is an attempt to illustrate a key message: the benefits of SOA are dwarfed by the drawbacks of not doing it. When I write SOA, I mean the combination of MDM (Master Data Management), EA (Enterprise Architecture) and Service Architecture on a global scale. This has been the topic of many of my posts, so I won't go into details. I won't detail the expected benefits either (cf. the previous post on SOA's presumed death): cost reduction, increased agility, sharing and complexity reduction.

What is clear for everyone is that this approach has a cost. It can be a large set-up cost for a first project or a moderate "architectural investment" if SOA is a sustained practice. I have witnessed the two alternatives when a large-scale new project is launched: with and without such an effort.

It is now clear for me that the true difference in the outcome is not the set of expected benefits of a "disciplined/architected" approach, but rather the unexpected benefits (what one could call the strategic agility) and even more the avoidance of major problems in the future life of the IT system that is being built, mostly with respect to data integration.

Distinguishing between strategic and tactical agility makes sense. Tactical agility may be defined as the ability to make the "easy transformations" to the information system (IS) as easily and cheaply as possible (they go hand in hand). Strategic agility is measured by the cost of making "the hard changes" (those that are deemed "impossible" the first time the need is mentioned). Easy vs. hard is both a matter of anticipation (strategic agility is the ability to move the IS towards a brand new direction) and scope. Most of the technology, such as middleware, is geared towards tactical agility. It helps to implement "reasonably easy changes" faster (sometimes much faster). But what helps to "turn around the ship a full 180 degrees" is the hard work on architecture (mostly, data architecture and then, service architecture). See the recently posted bibliography for more pointers. Strategic agility is difficult to evaluate, but not impossible. Playing with scenarios seems to be the best approach. See my book, or "Organisational Capital", edited by Ahmed Bounfour. For lack of a better word, I will call structural agility the "status of being able to avoid major problems" through the taming of complexity. When complexity is not curbed, unforeseen consequences start to happen. This is when we see huge overruns in budget and time, or even complete failure of large projects.

I am always worried when I see a "core-centric" project, usually motivated by the introduction of a new technology. A "sacred alliance" occurs between the client who sees the new technology as a quick relief of a precise pain, some itching that has been going on for a while, and the IT folks who are always happy to try on a new thing. New technology here may be a rule-based engine, a new database technology, some learning/recommendation engine, a new Web/Interface generation tool, etc. What I mean with "core-centric" is the focus on the core "new thing", the core "new benefit", as opposed to the edge: the integration with the rest, the way the new system interacts with the old stuff. Core-centric projects always follow a successful proof of concept. When the focus is on the core, it is actually hard to fail a proof of concept … I have, obviously, nothing against proofs of concept. They are clearly necessary: there is no reason to work on the hard stuff (the edge) if the core benefit is not worth it. But one must keep in mind that a successful "proof of concept" is the easy part. A common plague of core-centric projects is that they are often "designed by committees". This will be the topic of another post … an IT component needs to have one clear business customer (a physical person) and one identified architect.

Core-centric projects tend to start well and to end in misery when adding the last 20% of data integration seems to cost 80% of the effort. The sad thing is that one cannot buy an enterprise architecture from an outside vendor (although help/consulting works :)). A "turnkey" project, even with a high quality supplier, delivers exactly what you pay for, but no more. Precisely, the level of integration is what is defined in the specification. Extensibility, the ability to add a new data source, to take a new business practice into account, to adapt to a new competitor… are why architecture is necessary before adding a new IT component. One cannot expect to get this kind of "forward thinking" from an outside vendor. ISVs cannot, as a general rule (and each rule has its exception) carry the weight of the integration issue. Another way to say this is that integration should be "inward-bound" and not "outwards-bound": what matters most is outside the new project, not inside. This applies especially to SOA: it is much easier to define the services that a new component may expose than the services that it may use.

IT Strategic Alignment, a buzzword of these last 10 years is indeed a difficult exercise, because of the dynamics of the target – constantly moving - and the inertia of the IS. Enterprise Architecture is about system dynamics and trajectories. Aligning over the "strategy" is necessarily difficult because the target is vague and shifting its shape continuously. The fast movement of the target coupled with the slow speed of IT transformation means that both anticipation and abstraction are required. Anticipation is necessary because transforming the IS takes time. Even within the framework of a sustained SOA effort, herding the set of services towards the desired service architecture takes time. Abstraction is required to filter out the "micro-variations" and to focus on the key long-term changes.

Let us pick an analogy: imagine that we want to wrap objects of various shapes that are given to us randomly, with sheets of a rigid material. A good example would be statues, since they exhibit very different forms. To prepare beforehand, we pre-fold the sheets. Obviously, the objects represent the business opportunities … while folding the sheets represents undertaking IT projects to fit the opportunities. The preparation beforehand is similar to enterprise architecture: a little effort in advance to speed things up when the real problem occurs. However this preparation only helps if the folds are actually useful to wrap the new shapes. It is actually an interesting analogy since finding a set of versatile preparatory folds is a hard problem.

The following set of illustrations is taken from wrappings by Christo. They are not the best illustration of this fictitious example (no pre-folding since the wrapping material is not rigid) but they look nice :)

From this analogy we can pick three key principles:

One cannot do the architecture without knowing what "the future holds" from a business perspective: Service Architecture is about business
Going from the shape to the folds is tricky: Architecture is not for "dummies", it requires thinking and abstraction
One can overdo it and spend more time solving the puzzle than solving the business problem. It is easier to wrap a complex form than find the optimal set of folds that would help wrapping the most.

It would be easy to transform this post into a fictional story to contrast two approaches for a new complex IS project, with and without a SOA investment. I might do this as a new "Caroline story" for a new edition of my book, in a Tale-of-Two-Cities style. So, to return to the initial question, what could motivate Caroline to "do it right", if it means a slower start and an increase in the total cost? Purely defensive arguments are hard to sell, even if perfectly correct (e.g., a higher cost estimate but a higher probability of avoiding overruns). This brings us back to the concept of "situation potential", borrowed from Chinese strategy and which I have mentioned earlier. This is truly a powerful idea, worth yet another post; it unties the Gordian knot that mixes complexity, the difficulty to forecast and the different time scales. It may be seen as the combination of tactical agility, strategic agility and structural agility. Being able to demonstrate and sell the increase of "situation potential" is what it takes to develop a long-term Enterprise Architecture effort.

Friday, April 24, 2009

Selected Bibliography

Here is a short selection of my favorite books about IT and Enterprise Architecture. I will try to update and develop this selected bibliography in the future, and I am always on the lookout for additions. A more detailed list may be found in the bibliography section of my last book, but this is a "selection from the heart".

I assembled the first list for my course at the Ecole Polytechnique. These books are both inspiring and reasonably easy to read :)

R.J. Wieringa, "Design Methods for Reactive Systems: Yourdon, Statemate, and the UML", Morgan Kaufmann (2002)
A really great book about design methods, with both a lot of structure (theory) and practical insights from the domain of reactive systems. Very relevant for complex fields such as telecommunications.
P. Roques, "UML 2 en action : De l'analyse des besoins à la conception", Eyrolles (2007)
This is not a reference book (there are better books to learn about UML) but this is the best book I know to understand how to generate value from the practical application of UML.
P. W. Keen, "Shaping the Future: Business Design Through Information Technology", Harvard Business School Press (1991)
A key reference (heavily quoted) that is still the most comprehensive book on the topic of IT economics. Much better than newer books, especially nice for rebuking naïve statements about SOA, Web Services … or other "silver bullets". The best counterargument against "IT does not matter", from N. Carr, that I have read.
L. H. Putnam, "Five Core Metrics: The Intelligence Behind Successful Software Management", Dorset House Publishing (2003)
The smartest book I have found about software metrics. All the mistakes I have made previously are neatly identified. Not only is the difficult, multi-dimensional aspect of software measurement well accounted for, but the book also provides practical and efficient methods.
J. Printz, "Coûts et durée des projets informatiques pratique des modèles d'estimation", Lavoisier (2002)
A very nice introduction to Cocomo and other methods.
I. Jacobson, "The Unified Software Development Process", Addison Wesley (1999)
A classical reference that still makes for very good reading. Extremely useful to understand the current state of "software development best practices"
P. Grosjean & al., "Performance des architectures IT", Dunod (2007)
Amazing book: very practical yet rigorous, covers a large scope of issues and provides very relevant solutions to real world problems.
X. Fournier-Morel & al., "SOA, le guide de l'architecte du SI", Dunod (2008)
The best book I have ever read about Service Oriented Architectures. Obviously, covers the technical more than the governance side of SOA but is still the best book that I know of to understand what SOA is really about.
D. Gross, "Fundamentals of Queueing Theory", Wiley (1998)
One cannot talk about IT performance without a minimal background in Queueing Theory. This is one of the good introductory books (there are many others).
F.A. Cummins, "Enterprise Integration: An Architecture for Enterprise Application and Systems Integration", Wiley (2002)
My favorite book about IT integration. A practical book that hits all the key topics and does not shy away from the hard problems. Only 10% of the books that talk about EAI, SOA or integration infrastructure are actually relevant to "real world usage"; most of them are just a re-hash of marketing slides (all the glory, no guts :)). This is one of the precious few.
M. Tamer Ozsu, P. Valduriez, "Principles of Distributed Database Systems", Prentice Hall (1999)
My reference book on database systems. Although it is quite complete and covers most of the issues relevant to distributed systems, it is still an easy read.
E. Marcus, H. Stern, "Blueprints for High Availability", Wiley (2003)
Wonderful book about high availability. All you need to know, tons of practical advice and many examples. A must-read for anyone in IT operations.
K. Schmidt, "High Availability and Disaster Recovery: Concepts, Design, Implementation", Springer (2006)
More refined and detailed than the previous one, a great reference book on robustness and redundancy.
R. C. Seacord, "Modernizing Legacy Systems: Software Technologies, Engineering Processes, and Business Practices", Addison-Wesley (2003)
The only book that I know of that talks about "re-engineering of legacy systems" (what we call "refonte" in France) in a way that is consistent with my own experience as a CIO at Bouygues Telecom. All the hard issues are covered and the book is full of sound practical advice.

The next list is a reference list. These books are heavier, and are not meant to be read "in one shot". On the other hand, they contain "treasures of knowledge".

C. Jones, "Applied Software Measurement: Assuring Productivity and Quality", McGraw-Hill (1996)
My "bible" for the last 10 years: all the hard numbers necessary to model software development costs and quality assurance. This is "the survival kit" for anyone who wants to introduce function point measurement.
J. Printz, "Architecture logicielle: concevoir des applications adaptables", Dunod (2006)
This is a reference book on software architecture. It is very thorough, explaining all the hows with the whys. Very valuable for getting a deep understanding of software engineering principles.
J.-P. Meinadier, "Ingénierie et intégration des systèmes", Hermes (1998)
Still one of the best reference books about system engineering.
B. W. Boehm, "Software Cost Estimation with Cocomo II", Prentice Hall (2000)
No one can afford to miss Cocomo II, since it is the scientific reference for almost all the intuitions that one may develop after years spent on software projects.
W. Perry, "Effective Methods for Software Testing", Wiley & Sons (1995)
560 pages that tell 90% of what one should know about software testing. I was fortunate to spend many years with people who had spent their lives researching this topic at Bellcore, and I find this book to be surprisingly complete and accurate.
D. A. Menasce, "Performance by Design: Computer Capacity Planning By Example", Prentice Hall (2004)
The best book that I have read about performance modeling and capacity planning.

The last list contains fun books to read (at least from my perspective) which actually tell a lot about IT.

M. Crichton, "Jurassic Park", Ballantine Books (1991)
This is still the best way I know to get acquainted with chaos theory and why it is relevant to understand IT failures :)
C. Perrow, "Normal Accidents: Living with High Risk Technologies", Princeton University Press (1999).
A must-read about industrial accidents (such as TMI: Three Mile Island). The analysis and the proposed patterns are brilliant.
K. Kelly, "Out of Control: the New Biology of Machines, Social Systems and the Economic World", Perseus Books Group (1995)
My favorite book of all time, see earlier in this blog :)
C. Hibbs, S. Jewett, M. Sullivan, "The Art of Lean Software Development", O'Reilly (2008)
A wonderful, very short book that contains one of the best introductions to lean I have ever read, and a wealth of advice about software development. Very concise but incredibly relevant.
F. Brooks, "The Mythical Man-Month: Essays on Software Engineering, 20th Anniversary Edition", Addison-Wesley Professional (1995)
Still relevant after so many years!
T. DeMarco, "Peopleware: Productive Projects and Teams (Second Edition)", Dorset House (1999).
A treasure trove! A no-nonsense inquiry into what it takes to be productive when writing software. The most incredible part is that the main contributions (such as the negative influence of disruption) are still unique to this day (to my knowledge).

Let me know about your own favorite IT books.

Saturday, January 24, 2009

SOA is much too young to be dead

Today I'll look into the SOA controversy that was generated by Anne Thomas Manes's obituary for SOA.

This may look like a slow response (90% of the blogosphere has already published their comments) but I will take some length to explain the issue (and this required a little time to think it over). Surprisingly, one still gets a very fuzzy and abstract idea of what SOA is from reading all the positive answers to Anne Thomas Manes.

What is much clearer are the benefits one should expect from SOA (and they have been clear since 2003). SOA brings increased agility and reduced costs. The increased agility comes from reusability and the cost reduction comes from sharing services. Reusability comes in two flavors: direct (reuse of an existing service) and indirect, i.e. mash-up (quickly mixing existing services to produce a new one).
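
To make the two flavors of reusability concrete, here is a minimal sketch in Python. Everything in it is made up for the illustration (the two "existing services", their names and their data are assumptions, not part of any real catalog): the first new function reuses a service directly, the second one is a mash-up built only by combining existing services.

```python
# Hypothetical existing services; names and data are illustrative only.
def get_customer(customer_id: str) -> dict:
    """Existing service: look up a customer record."""
    customers = {"c1": {"name": "Alice", "city": "Paris"}}
    return customers[customer_id]


def get_coverage(city: str) -> str:
    """Existing service: network coverage level for a city."""
    coverage = {"Paris": "full", "Brest": "partial"}
    return coverage.get(city, "unknown")


# Direct reuse: a new application calls one existing service as-is.
def welcome_message(customer_id: str) -> str:
    return f"Hello {get_customer(customer_id)['name']}"


# Indirect reuse (mash-up): a new service obtained by chaining existing ones.
def customer_coverage(customer_id: str) -> str:
    return get_coverage(get_customer(customer_id)["city"])
```

In both cases no existing service had to be modified, which is where the agility is expected to come from.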

There are additional benefits (such as the ease of mixing internal and outsourced services) but these two are enough to get anyone's attention. The question is: how do we get there?

1. SOA is a 3-step process

First, we decompose SOA into three steps ("doing SOA" means doing all three in sequence):

Service Definition
Service Architecture
Service Integration

Service Definition simply means coming up with the catalog of services. It is a "business oriented" step: the business processes are analyzed and the relevant business services are identified. At first it looks like the "simple" (anything but simple, actually) first part of a functional analysis of a new system. There is (much) more to it, because the role/user analysis is an integral part. The focus on reusability and agility starts here. Hence it is necessary to play with business scenarios, which further means that it is not an information system play. In the "acronym soup", the relevant terms are EA (Enterprise Architecture) and SOBA (Service Oriented Business Architecture), which were introduced later on to insist on the "business/enterprise" dimension.

Service Architecture is the technical step that takes a large catalog as an input and produces a structured hierarchy as an output. A flat list of thousands of equally visible services is in no way a feasible approach to managing a large information system. Classical issues such as modularity, encapsulation, visibility, etc. must be taken into consideration. I write "classical" because this is the heart of software engineering and software architecture. Re-factoring techniques (the same that are used when writing a class or a component library) must be applied to produce services that have a better chance of being reused or being combined. I won't dwell on it today, but the role of data & models architecture is key (as I said, this step is the IS step).
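
As a toy illustration of "flat catalog in, structured hierarchy out", here is a short Python sketch. It is only one possible reading of the idea, under assumptions of mine: each service is tagged with a hypothetical functional domain and a public/internal flag, and visibility follows a simple encapsulation rule (a consumer sees its own domain's services plus the public services of other domains). Real service architecture involves far more than this grouping.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    domain: str   # hypothetical functional domain (assumption)
    public: bool  # True if visible outside its own domain


def build_hierarchy(catalog):
    """Group a flat catalog by domain, separating public from internal services."""
    hierarchy = defaultdict(lambda: {"public": [], "internal": []})
    for service in catalog:
        kind = "public" if service.public else "internal"
        hierarchy[service.domain][kind].append(service.name)
    return dict(hierarchy)


def visible_from(hierarchy, domain):
    """Services callable from `domain`: all public ones plus its own internal ones."""
    visible = []
    for name, group in hierarchy.items():
        visible += group["public"]
        if name == domain:
            visible += group["internal"]
    return sorted(visible)


catalog = [
    Service("CreateOrder", "sales", True),
    Service("PriceOrder", "sales", False),
    Service("SendInvoice", "billing", True),
    Service("ComputeTax", "billing", False),
]
hierarchy = build_hierarchy(catalog)
```

With this made-up catalog, a consumer in the "sales" domain sees CreateOrder, PriceOrder and SendInvoice, while ComputeTax stays encapsulated inside "billing".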

Service Integration is what most people think about when they say SOA: linking all the services into an integration infrastructure. This is where all the technological issues live (ESB or not, SOAP or REST, etc.). The term SOI (Service Oriented Integration - for instance, see this definition) was coined to denote this step.

These three steps help to grasp the huge difference between "SOA in the large" (with a large enterprise as its scope) and "SOA in the small" (SOA for one information system, either for a small company or one department). The distribution of effort obviously varies, but here is the pattern from my own experience:

In the large: 40/30/30 (i.e., 40% of the effort goes to definition, 30% to architecture and 30% to integration)
In the small: 20/10/70
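
For readers who like to see the arithmetic, here is a small Python sketch that applies these two splits to a total effort. The percentages are the ones quoted above; the function name and the 1000 person-day total are assumptions chosen for the example.

```python
# Effort split (definition, architecture, integration), in percent.
EFFORT_SPLIT = {
    "in the large": (40, 30, 30),
    "in the small": (20, 10, 70),
}
STEPS = ("definition", "architecture", "integration")


def split_effort(total_days, scope):
    """Distribute a total effort (in person-days) over the three SOA steps."""
    shares = EFFORT_SPLIT[scope]
    return {step: total_days * share / 100 for step, share in zip(STEPS, shares)}


# Hypothetical 1000 person-day program, for illustration only.
large = split_effort(1000, "in the large")
small = split_effort(1000, "in the small")
```

For the same hypothetical total, integration takes 300 person-days in the large and 700 in the small, which is why "SOA in the small" feels mostly like an integration project.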

It is interesting to notice that there is (almost) nothing new with "SOA in the small": service definition + architecture + flexible implementation is what has been making a good piece of software for 30 years.

2. What is easy/hard?

Service Definition is hard when the scale gets big. Although producing the service catalog is straightforward for a department-size project, there are multiple issues of heterogeneity and multiplicity of stakeholders that make it much more complex to manage at the (large) enterprise level. Not only does the management of the catalog get harder as the size grows, but there is also a subtle balance to be found between "common sense" and "formal methods". There are indeed methods/patterns/frameworks available for this first step but:

if this step is too formal, too many stakeholders are discouraged
if it is too informal, the process crumbles under its own weight.

The same is even truer of Service Architecture. It is quite simple when doing "SOA in the small" but gets harder as the size grows. Contrary to Service Definition, there is little help available in the form of a methodology: hard-earned IT experience is necessary. Unfortunately, this is not something that one learns through writing PowerPoint slides. If that step is poorly executed, the integration is more difficult, but - and this is the key point - little reusability or agility is observed later on. This is something that I am trying to teach during my "Information System Course" at Polytechnique, but this is a hard, technical topic which is better acquired through practice than theory.

Service Integration is actually not so difficult nowadays, because the technology is quite mature. I am not saying that it is easy, but it would be sad to ignore the progress that has been delivered by technology providers. There are still a few difficult issues (cf. next section) and, more generally, a large-scale integration project still exhibits all the possible pitfalls of a complex IT project, but there is nothing new here. Actually, when done "in the small", this is straightforward and there are so many success stories to prove it.

3. What is the maturity state of SOA?

Using the same decomposition, here is my own assessment:

50% for Service Definition. This step is rather well explained and many good books are available (such as "SOA for profit"). It is still a difficult job at the enterprise level. There is indeed a governance issue (those who mock the "governance issue" have missed what Enterprise Architecture is, and are still trying to accomplish "small scale SOI"). There is also a "pedagogy issue": an enterprise-scale effort cannot be undertaken if its meaning is not understood by all stakeholders.
20% for Service Architecture. There are very few books that can help, and even fewer are directly relevant for SOA. The only one that I recommend to my students is "SOA, Le guide de l'architecte" (in French, sorry to my English-speaking readers). I have tens of books about IT/software architecture in my private library, including great pieces from Fred Cummins, Len Bass, Richard Shuey, Paul Clements, Jean-Paul Meinadier, Jacques Printz, Robert Seacord, Roel Wieringa - to name a few -, but none of them (including my own two books :)) addresses the issue of Service Architecture "in the large" thoroughly.
70% for Service Integration. The technology is there, it is scalable and proven (WebSphere or WebLogic being two classical commercial examples). Obviously, there are still technical issues. For instance,

distributed transactions & synchronisation are still a hard problem.
performance overheads (cf. SOA is not scale-free) still exist and make the deployment tricky when response time is an issue.
monitoring & autonomic load balancing are difficult and/or not sufficiently developed.
service discovery and pub/sub architecture (EOA) are not as straightforward to implement as one might wish.

But those are hard issues, nothing to be ashamed of :)

4. SOA is a practice, not a target or a technology

As David Linthicum said: "SOA is something you do, not something you buy". It is not something that you do once (a project) but that you do all the time, continuously.

This is what led to the creation of the sustainable IT architecture approach. Done right, SOA is a practice that transforms information systems for the better. Depending on the money that you invest, on the legacy and on the business constraints, the transformation is slower or faster, but it takes time and never ends. The sustainable IT approach comes from a change of perspective: instead of always focusing on what is to come, on the new projects, one should search for more value from the "existing assets". Another "sustainable" principle is to save your strength to last longer (hence the continuous improvement approach as opposed to the big bang).

It's a "culture thing". Part of it (cf. Service definition) is everybody's concern, hence it must be simple and must be pushed at the CxO level. Therefore, IT governance is a key component for changing the culture.

I will conclude by stating that SOA represents a "Chinese strategy" for IT management, in the sense of François Jullien. I cannot urge you enough to read his book, but if you have not, a key principle is to distinguish between planned strategy (top-down, in the Greek fashion) and "building up your situation potential". A Chinese proverb says that one does not grow a plant faster by pulling its stem.

Similarly, instead of pushing "SOA technology projects", a better approach is to build up your "situation potential": build a common data model, train everyone, build a shared consensus about what the service catalog should be, etc. Each project is then seen as an opportunity to move one step in the right direction.
