Saturday, November 26, 2011

Lean IT, Devops and Cloud Programming

I have become more and more interested in lean IT over the years. It started with the book “The Art of Lean Software Development” by Curt Hibbs.
I enjoyed this book because its introduction to lean is fully compatible with what I learned from Jeffrey Liker and other great authors about TPS. This simple book helped me draw the connections between good software development methods, such as extreme programming or agile programming, and lean management. For those of you who are not familiar with lean or extreme programming, here is a very crude summary of some of the most salient points of lean software development:
  • No delays: avoid as much as possible work that sits between two steps of the development process (what is called WIP – work in process – in the lean jargon). This is true at the process level (the goal is to design a streamlined, “single piece flow” development organization) and at the developer level. A practical goal is to avoid switching tasks as much as possible: focus on one thing and do it right!
  • Quality in (right first time): this lean principle translates into testing as early as possible (a tenet of agile programming), but also into using every technique that improves the quality of the code, even at the expense of source code productivity, since we all know that it is cheaper not to produce a bug than to remove it later. Here come, for instance, the practices of pair programming and code reviews, but also “good practices” such as programming guidelines and standards.
  • Fast delivery: the lean principle is to reduce the “lead time” of the software development process, which requires working on all stages. Removing in-between delays (cf. earlier) is necessary but not sufficient. Continuous integration is a core technique for achieving this goal, as are fast deployment techniques.
  • Short deliveries: it is more efficient to produce small pieces of software at a high rate than bigger pieces at a lower rate. This is another key principle from lean (“small batches”), which is doubly true for software development: not only are smaller batches easier to build (a well-known law of software engineering), but the continuous evolution of customer needs and environment makes it easier to adapt with a small-batch approach.
  • Less code: this is the application of the KISS principle! Lean software development tries to stay away from complexity (see later in this post). Unnecessary code may pop up from many sources, so lean applies a technique called VSM (Value-Stream Mapping) and a posture of “muda removal”. Muda (waste) removal means going through the process with the “eyes of the customer” and removing everything that does not produce value from her perspective. VSM is a tool that tracks value creation and assigns it to each step of the process. Lean software development aims at producing the right product, without unnecessary features. It is also an architecture principle (stay away from complexity) aimed at simpler and faster maintenance over the years.
  • Customer participation: the best way to produce only what is necessary from the user’s perspective is to ask her frequently! This is why end-user/customer participation is a tenet of agile programming. When the customer is not available, the principle combines with “small batches” to become: deliver fast, fail faster, to succeed sooner.
  • Regular work effort: the leveling of effort is a key principle of extreme programming, the equivalent of “heijunka” in lean. A few years ago, when I was still a CIO, I started thinking about “extreme IT” (applying extreme programming at the information system level), and the sustainability of the effort is a crucial point. Regularity has a counterpart, which is discipline. Using methods and tools (such as configuration management, source code versioning, automated testing) is crucial. One should take a look at the wonderful talk by Mark Striebeck on “Creating a testing culture” at Google.
  • Minimize handovers: complex tasks, such as writing software, are better accomplished when the responsibility chain is not cut into too many segments. Working as a team is the only way to deliver complex projects (this is the topic of my third book). This is another insight from the Agile Manifesto, and it is why today’s best practice is to assemble multi-disciplinary teams including software developers, marketers and designers.
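The first principles above have a simple quantitative backbone: Little’s law, a well-known queueing result that ties WIP directly to lead time. A minimal sketch, with made-up numbers (not figures from any real project):

```python
# Little's law: average lead time = average WIP / average throughput.
# Illustrative numbers only.

def lead_time(wip: float, throughput_per_day: float) -> float:
    """Average time a work item spends in the process, in days."""
    return wip / throughput_per_day

# A team finishing 4 user stories per day with 20 stories in flight:
print(lead_time(20, 4))   # 5.0 days
# Halving WIP halves the lead time at the same throughput:
print(lead_time(10, 4))   # 2.5 days
```

This is why “no delays” and “fast delivery” are really the same principle seen from two sides: the only way to cut lead time without pushing people harder is to cut the work sitting in between.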
There are a number of interesting books on this topic. The most classical are those by Mary Poppendieck, such as “Implementing Lean Software Development: From Concept to Cash”. I had the pleasure of presenting my latest book at the Lean IT Summit a month ago and of meeting Mary Poppendieck (a number of great talks, including hers, Michael Ballé’s and Mark Striebeck’s, are now available online).
There are also books which are not about software development per se but are very closely related. I am especially fond of two books, which I have reviewed on my French blog:
Thanks to Guillaume Fortaine, I have come to learn about Devops. Quoting from Wikipedia, DevOps is “an emerging set of principles, methods and practices for communication, collaboration and integration between software development (application/software engineering) and IT operations (systems administration/infrastructure) professionals”. The more I read about Devops, the more I find that it is the missing link between Agile/Extreme Programming and lean management (Toyota-style, hence Lean-Startup-style). Although Devops claims to “help finish what Agile Development started”, there are different flavors in what one may read on the web: bridging the development/admin & operations gap, making lean IT happen, making collaboration happen, bringing agility to operations – which in turn makes it easier to leverage the benefits of new computing resources such as cloud programming. My first obvious interest in Devops has been the practical deployment of lean IT, as a follow-up to what I explained earlier. For instance, Devops promotes a number of interesting tools related to deployment automation, such as GLU. It turns out that the “Cloud and Devops intersection” is also quite promising. Indeed, leveraging the strengths of cloud programming requires a shift in development/architecture culture that calls for a “Devops-like” approach.
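Deployment automation tools such as GLU are built around one key property: running the same deployment twice must be safe. Here is a minimal, hypothetical sketch of that idempotence idea in Python; the function names and state format are my own invention for illustration, not GLU’s actual API:

```python
# Hypothetical sketch of idempotent deployment logic (invented names,
# not GLU's API): compute only the actions needed to reach the desired
# state, so that re-running the deployment is harmless.

def deploy(host_state: dict, desired_version: str) -> list:
    """Return the actions needed to bring one host to the desired version."""
    actions = []
    if host_state.get("version") != desired_version:
        actions += [("stop", host_state["app"]),
                    ("install", desired_version),
                    ("start", host_state["app"])]
        host_state["version"] = desired_version
    return actions

state = {"app": "frontend", "version": "1.1"}
print(deploy(state, "1.2"))  # three actions on the first run
print(deploy(state, "1.2"))  # [] on the second run: nothing left to do
```

This “describe the desired state, let the tool compute the delta” posture is what makes automated deployment repeatable enough to support the fast, small deliveries of lean IT.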

It happens that I gave a talk last week at the Model-Driven Day about “Complexity, Modularity and Abstraction” (the talk is available in the “My Shared Document” box on the left). The talk is about, among other things:
  • Complexity (why it is important, how to measure it and how to tame it – since avoiding complexity altogether is not an option at the IS level)
  • Sustainability: how to transform enterprise architecture into a regular practice. This is related to the concept of “extreme IT”: avoid the “heroic struggle” and move towards the “continuous transformation of the information system”. This is a major reason why I have advocated SOA as a company-wide enterprise architecture practice for many years.
  • “Architecture-oriented services”: I made this pun to emphasize the difficulty of producing the “right” services through SOA. “Architecture-oriented” means services that have the right level of abstraction, that are modular and “easy to compose”. To my knowledge, there is no easy recipe for this, but the wisdom and folklore of 40 years of software architecture design apply.
  • Cloud computing: I have added a small section about “cloud-ready architecture” in the 4th edition of my first book. I strongly believe that information systems will change in a spectacular manner when we learn how to exploit massively parallel architectures (using tools/approaches such as Hadoop/MapReduce). Using cloud computing to provide a few SaaS-based front-office services is nice and relevant, but the big change comes when cloud computing is applied to back-office services (provisioning, billing, data mining). This requires an architecture change, but mostly it requires a culture change.
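For readers unfamiliar with the MapReduce programming model mentioned above, it can be shown in a few lines of plain Python. This is a toy word count, not the Hadoop API:

```python
# A minimal map/shuffle/reduce word count in plain Python, illustrating
# the programming model behind Hadoop/MapReduce (toy sketch only).
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word occurrence.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values.
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud is programmable", "the grid is smart"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"])  # 2
print(counts["is"])   # 2
```

The point of the model is that the map and reduce steps are trivially parallelizable across machines, which is exactly the “massively parallel architecture” shift discussed above.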
The thought that prompted me to write this post is as follows. Devops is the missing link between the themes of my MDD talk: managing complexity, delivering agility/modularity and moving to the new century of massively parallel computing (including, but not restricted to, cloud computing). I speak of “Cloud Programming” in the title because I agree with a comment made by George Reese: the Cloud is a computing resource defined by its API, i.e., a resource that is managed through programming. The exact quote is:
“Cloud is, for the purposes of this discussion, sort of the pinnacle of SOA in that it makes everything controllable through an API. If it has no API it is not Cloud. If you buy into this then you agree that everything that is Cloud is ultimately programmable.”
Not only is the cloud more agile (up/down scalability), it may be controlled by the very piece of software that is using it. From a systemic perspective, it's a "whole new game" :)
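To make the “resource managed through programming” idea concrete, here is a hedged sketch in which the application scales its own capacity through a cloud API. The CloudAPI class is invented for illustration; it is not any real provider’s SDK, though real providers expose similar scale-up/scale-down calls:

```python
# Sketch of software controlling its own computing resource through an
# API. CloudAPI is a stand-in invented for this example.

class CloudAPI:
    def __init__(self):
        self.instances = 2
    def scale_to(self, n):
        self.instances = n

def autoscale(cloud, load_per_instance_target, current_load):
    # The application itself decides how much capacity it needs ...
    needed = max(1, -(-current_load // load_per_instance_target))  # ceiling division
    # ... and provisions it through the API: the resource is programmable.
    cloud.scale_to(needed)
    return needed

cloud = CloudAPI()
print(autoscale(cloud, 100, 450))  # 5 instances for a load of 450
print(autoscale(cloud, 100, 80))   # back down to 1 when the load drops
```

The systemic novelty is the feedback loop: the consumer of the resource is also its controller, which is exactly what makes the cloud a “whole new game”.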

Sunday, September 25, 2011

Systemic Simulation of Smart Grids

This blog has been extremely quiet for two years, mostly because I was busy with other topics, including Lean and Enterprise 2.0. In my previous post, I mentioned that I was looking at “Smart Distributed Things” through two instances, Smart Grids and Smart Home Networks. I will not talk about the second in this blog, since it is work-related (we’ve come up with the acronym BAHN : Bouygues Autonomous Home Network).
Smart Grids make an interesting instance of the “smart distributed network” because of the extraordinary amount of interest/excitement/hype that surrounds this topic. Opinions conflict, ranging from “this is a straightforward and marginal improvement of existing networks, which are already quite smart” to “smart grids are the backbone of a new sustainable society, based on communities, subsidiarity (an organic, multi-scale, resilient organization) and self-aware optimal management of resources”.
I am not an expert in any of these topics: electricity, power grids, energy, storage … but like any citizen, I would like to form my own opinion about the future of our country and our planet, as far as energy and global warming are concerned. Hence the idea of a small systemic simulation emerged during my vacation month (last month). I tried to assemble a crude model of all the different aspects of the “smart grid ecosystem” as we may understand it, without too much detail, just the broad principles. Each aspect is actually quite simple to explain if taken in isolation (e.g., anyone may understand why it is smart to couple a solar panel with local storage). It is the combination of all viewpoints, together with the huge amount of uncertainty (about the economy, the speed at which technology will get cheaper, the speed at which behaviors will change, etc.), which makes this a hard topic.
This is where I had an insight: I actually have a method for complex intertwined models where half is unknown and the other half is unclear: GTES (Game-Theoretical Evolutionary Simulation). GTES is the perfect platform to assemble conflicting views of what a smart grid should be, conflicting views about how the actors should behave, and to try to make some sense of them. In a word, I am planning to build a “serious game” to have a closer look at smart grids.
Let me first clarify what I said about the range of opinions and introduce three views of the smart grid, which I could label: the “Utility view”, the “Google view” and the “Japanese view”:

  1. The “Utility view” defines a smart grid as adapting the power network to local sources (as opposed to a one-way distribution network that goes from a few large GW production units towards millions of consumers), adapting to intermittent production sources (through storage and by favoring flexible production units that can adapt to the power surges of intermittent sources like solar or wind) and using price incentives to “shave” demand peaks. This is a “no-brainer” program (“what to do” is clear), where the major issue is price: most techniques show a cost/benefit ratio that is worse than current practice. To believe in this approach, you must believe that new technology prices will go down (e.g., solar, storage), or that electricity prices will go “through the roof” in the next 20 years, or that global warming fears will drive a significant price for CO2.
  2. The “Google view” defines a smart grid as a change from a tree structure to a network structure (centralized to de-centralized), the use of market forces to create a dynamic and more efficient equilibrium between supply and demand, and the use of IT to provide information to all actors, including end consumers. Signals (pricing and power grid control) are so important in this view that it is often said that telecom, IT and energy networks will merge (something which I don’t believe for a second – but it conveys the importance of “smart” in “smart network”). Calling this the “Google view” is a friendly reference to “What Would Google Do?” :) The core of this approach is the principle that a significant amount of efficiency would be obtained with a “hyper fluid market”, made possible through IT technology.
  3. The “Japanese view” is human-centered instead of being techno-centered. The goal is to change human behavior to adapt to new challenges (lack of resources, global warming, …). Smart grids are the backbone of a multi-scale architecture (smart home, neighborhood, city, region, country) where each level has its own resources and autonomy, resulting in a system that is more resilient and with more engagement as a consequence of more responsibilities. Smart grids support the necessary behavioral transformation through communities and constant feedback. I call this a “Japanese vision” because I have heard it beautifully explained in Tokyo, but the systemic approach is common to Asia as a whole. Because active communities are “engaging”, Smart Grids help get rid of waste (muda in the lean sense) such as useless transportation, unnecessary usage, etc.

These views do not necessarily conflict; it is possible to envision the union of all these ideas. However, as soon as one tries to imagine a convincing deployment scenario, there are a number of questions that pop to mind. Here are a few examples:
  • What part does local storage play? Can there be a smart grid without distributed storage? What price hypotheses are necessary to make this realistic, considering that current energy storage prices are too high to justify large-scale deployment? Even if solar or wind energy becomes free, having to store it with today’s techniques makes the approach uncompetitive (with today’s parameters, obviously).
  • What CO2 price would significantly change the cost/benefit analysis? Actually, this extends more generally to the pair of energy prices + CO2 price, but the CO2 price is especially significant. Because of the abundance of coal, and because of the increasing availability of new forms of gas, the CO2 price is, from a naïve point of view, the key factor that could change the economic analysis of introducing alternative intermittent sources of energy.
  • What is the systemic benefit of local management, that is, of giving local communities the autonomy to handle part of their energy decisions, at different scales? This is a “system dynamics” issue, related to the handling of peaks, shortages and crises. In regular mode, there are obvious benefits to the centralized approach, from economies of scale to averaging pseudo-independent demand. “Pseudo” is a tiny word but full of consequences: the absence of independence is what produces complex systems and disasters, from financial crises to industrial accidents. It is the study of bursts and dynamic scenarios, with feedback loops, that may show the benefit of a “smart system” with faster counter-measures and learning abilities.
  • What could be the large-scale effect of dynamic pricing on the self-optimization of customer demand? The “marginal story” of “peak demand shaving” is beautiful: remove the few hours of peak production by gas/oil turbines, which produce CO2 at a high marginal cost, by returning some of the value to the consumer to convince her/him to postpone part of her usage. The most common example is heating (house or water), since inertia makes postponing a viable alternative. However, the story does not necessarily scale so well, nor does it necessarily address the “resilience issue”. If additional capacity is needed no matter what to cope with some kinds of peaks, its marginal cost for dealing with the “other types” becomes quite small. We are back to a “system dynamics” issue that cannot be resolved with a “back-of-the-envelope” ROI computation, but requires a large-scale simulation.

My goal is to get a first simple set of answers to these questions. Obviously, using GTES is not going to give me a price or a definite answer, but rather a “sense of how things are interrelated and react as a whole (system)”. Here is a simplified description of the model that I built last month:

This is a “game theory model” with four actors:

(1) The regulator (government), which controls the price of CO2 and may either favor renewable energy (investment incentives through tax breaks) or regulate it (impose joint storage with new intermittent sources). The long-term goal (what we call a strategy in GTES) of the regulator is to reduce CO2 emissions while maintaining the economic output of the country.
(2) The utility (national energy supplier), which runs its production assets, distributes electricity and sets its price dynamically according to demand and primary energy (oil/gas) price variations. One will notice that I implement an ideal world where prices can be set freely and change constantly, which is very far from the truth, but I want to address question #4. The long-term goal of the utility is to make money, deliver a proper return on investment on its new acquisitions (if any) and ensure resilience (the ability to serve the necessary amount of energy in the future).
(3) The local operator, who operates a smart grid associated with a city or a county. The operator manages all the local production and storage capacity that is linked to the grid (wind turbines, solar panels, storage, etc.). It also operates a small-scale fossil fuel plant to provide additional electricity when required, although it may also buy it at wholesale prices from the utility. The long-term goal of the operator is simply to make money :)
(4) The end consumer, who tries to reduce her electricity bill (both its average value and its expected maximum value) while preserving her comfort. Optionally, this may also mean reducing her CO2 emissions :)
The smart grid architecture is pretty naïve. I simulate a country that could be France or another European country, with one electricity utility, one thousand smart cities/smart communities (each with its own local operator), and twenty million households. The national supplier produces most of the electricity when the game starts, using a mix of nuclear and fossil plants. The electricity is either sold to the local operators, or sold directly to the consumers (hence each operator has a given market share, that is, the percentage of households that get their energy from this alternative supplier).
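As an illustration only, the four actors and the country-level setup could be sketched as Python data structures. The class names, fields and default values below are my own assumptions for the sake of the example, not the author’s actual code:

```python
# Data-structure sketch of the four-actor game (illustrative names and
# default values, not the real model).
from dataclasses import dataclass, field

@dataclass
class Regulator:
    co2_price: float = 20.0           # EUR per ton, a control variable
    renewable_tax_break: float = 0.1

@dataclass
class Utility:
    nuclear_capacity_gw: float = 60.0
    fossil_capacity_gw: float = 25.0
    wholesale_price: float = 50.0     # EUR per MWh, set dynamically

@dataclass
class LocalOperator:
    market_share: float = 0.3         # share of local households served
    storage_mwh: float = 0.0
    renewable_capacity_mw: float = 0.0

@dataclass
class Country:
    regulator: Regulator = field(default_factory=Regulator)
    utility: Utility = field(default_factory=Utility)
    operators: list = field(default_factory=lambda: [LocalOperator() for _ in range(1000)])
    households: int = 20_000_000

c = Country()
print(len(c.operators))  # 1000 smart cities, as in the post
```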

- The heart of the model is demand generation. Running the model once demand is established is not difficult, although the operational mode of the operator requires a careful description, since many choices need to be made (local production vs. wholesale buying, using energy from storage or storing energy for future use, etc.). The demand generator is actually crude and starts from a yearly and a daily pattern, to which a lot of random noise is added. The model allows both for independent noise (each consumer is different) and dependent variation (weather variations are shared by everyone in the same city).
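A minimal sketch of such a demand generator, assuming a cosine yearly/daily shape and Gaussian noise. All coefficients here are illustrative, not calibrated values from the model:

```python
# Crude demand generator: yearly pattern x daily pattern, plus
# independent (per-household) and shared (city weather) noise.
# Shapes and coefficients are illustrative assumptions.
import math
import random

def demand(hour_of_year, household_noise, city_weather_noise):
    day = hour_of_year / 24.0
    yearly = 1.0 + 0.3 * math.cos(2 * math.pi * day / 365.0)  # winter peak
    daily = 1.0 + 0.2 * math.cos(2 * math.pi * ((hour_of_year % 24) - 19) / 24.0)  # evening peak
    return max(0.0, yearly * daily * (1 + household_noise + city_weather_noise))

rng = random.Random(42)
city_weather = rng.gauss(0, 0.1)   # shared by every household in the city
loads = [demand(19, rng.gauss(0, 0.05), city_weather) for _ in range(1000)]
print(round(sum(loads) / len(loads), 2))  # average load around the evening peak
```

The key design point is the split between independent and shared noise: it is the shared weather term that creates the correlated peaks which make the grid hard to operate.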

- Each actor plays its game through a few decisions (what we call its tactic in GTES). The utility’s most important decision is its pricing tactic. Prices are defined through a set of linear formulas, the coefficients of which are GTES tactical parameters. For those who are not familiar with GTES, an evolutionary algorithm (local optimization) is run to optimize (find the best values of) these tactical parameters. Other decisions by the utility involve further investments in its production plant. The local operator has a similar range of choices to make. It needs to define its operational production mode; it also needs to set up its pricing scheme. Once a year, it needs to decide about new investments, whether additional fossil fuel production capacity, renewable energy investments or additional storage. The end consumer can decide to reduce her demand when the price gets too high, at the expense of comfort (the tradeoff between the two being a design parameter of the model). The consumer may also switch from the national to a local provider and back. Last, the consumer may invest in “negawatt equipment” (such as house insulation or more energy-efficient appliances).
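The local optimization of tactical parameters can be illustrated with a toy hill-climbing loop over the coefficients of a pricing formula. The payoff function below is a stand-in with a known optimum, not the actual utility model:

```python
# Sketch of the evolutionary/local-optimization idea behind GTES tactics:
# mutate the coefficients of a linear pricing formula, keep improvements.
# The payoff is a stub peaked at (a=0.5, b=0.2), invented for the example.
import random

def payoff(coeffs):
    a, b = coeffs
    return -((a - 0.5) ** 2 + (b - 0.2) ** 2)

def hill_climb(coeffs, steps=2000, scale=0.05, seed=1):
    rng = random.Random(seed)
    best, best_value = list(coeffs), payoff(coeffs)
    for _ in range(steps):
        candidate = [c + rng.gauss(0, scale) for c in best]
        value = payoff(candidate)
        if value > best_value:        # keep the mutation only if it improves
            best, best_value = candidate, value
    return best

a, b = hill_climb([0.0, 0.0])
print(round(a, 1), round(b, 1))  # converges near 0.5 0.2
```

In the real model the payoff is the actor’s long-term goal (its strategy) evaluated by running the simulation, and each actor’s tactical parameters are optimized in turn.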

This whole model defines a “game”, which could actually be turned into a real game, SIM-style. Each actor is trying to maximize its long-term goal while making the proper “tactical choices”.
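The equilibrium-seeking nature of such a game can be illustrated with a toy best-response iteration between two of the actors. The linear response functions are invented for the example; GTES does this with the full smart-grid model:

```python
# Toy illustration of equilibrium search: iterate best responses between
# the utility and a local operator until neither wants to deviate.
# The linear best-response functions are stubs invented for the example.

def best_response_u(operator_price):
    # Utility's best wholesale price given the operator's retail price.
    return 0.5 * operator_price + 10

def best_response_o(utility_price):
    # Operator's best retail price given the utility's wholesale price.
    return 0.8 * utility_price + 20

u, o = 50.0, 50.0
for _ in range(100):
    u, o = best_response_u(o), best_response_o(u)
print(round(u, 2), round(o, 2))  # the fixed point is a (Nash) equilibrium of the stub game
```

When neither actor can improve its payoff by changing its own price, the pair is a Nash equilibrium, which is exactly what the game-theoretical part of GTES looks for.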
What are the next steps?
  1. The code was written last month, but I still need to run “20-year simulations” and make sure that the model is credible (the story told by one simulation run makes sense).
  2. I then need to “explore the tactics”, which means checking that when one of the actors makes a decision, the impact on the game outcome is credible. This is the most time-consuming part, even with a simple model like this one. This ensures that the game is “realistic”, even if it is obviously naïve.
  3. Apply game theory to find (Nash) equilibria – this is the fun part, since I have nothing to do and will leverage code that I have written for other problems. This is what GTES is designed for: looking at the conflicting strategies of the different actors.
  4. The last step of GTES is to randomize the “design parameters”. The model relies on a number of design parameters, such as the demand-generation curve, the sensitivity to price or the efficiency of peak shaving, to name a few. I have no way to calibrate the S-curves that I am using, so I randomize the choice of these design parameters (Monte-Carlo simulation) to see if their values change the conclusions that I would like to draw from these repeated simulations.
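The last step above can be sketched as follows: randomize the design parameters (here, the midpoint and slope of an S-curve for price sensitivity) and look at the spread of outcomes. The parameter ranges are illustrative assumptions, not calibrated values:

```python
# Monte-Carlo over design parameters: sample the S-curve used for price
# sensitivity and check how much the outcome varies across designs.
# Parameter ranges are illustrative assumptions.
import math
import random

def s_curve(price, midpoint, slope):
    """Fraction of demand shed at a given price (logistic S-curve)."""
    return 1.0 / (1.0 + math.exp(-slope * (price - midpoint)))

rng = random.Random(0)
sheddings = []
for _ in range(10_000):                 # Monte-Carlo over designs
    midpoint = rng.uniform(80, 160)     # EUR/MWh where the response kicks in
    slope = rng.uniform(0.02, 0.1)
    sheddings.append(s_curve(120, midpoint, slope))
print(round(min(sheddings), 2), round(max(sheddings), 2))
```

If a conclusion holds across the whole sampled range of designs, it is robust to the calibration uncertainty; if the spread is wide, the conclusion depends on parameters we cannot pin down.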
I’ll post a summary of my results if and when the computational experiments are successful. Today’s goal was just to share my overall analysis of the “smart grids” domain.

Sunday, January 2, 2011

Darwin, Lamarck and Service-Oriented Architecture

This blog has been sleeping in 2010 since I was writing a third book, on “Business Processes and Enterprise 2.0”, an attempt to capture my past years’ involvement with lean management and information flows. Now that the book is finished (I expect it to be published this spring), I am turning my attention back to autonomic/autonomous systems, networks and grids.

Although one could say that the promises of "Autonomic computing" (circa 2003/2004) have not materialized in the world of IT, the premises remain valid. My belief is that it will simply take longer to get effective technology in place in the world of corporate IT. As a research theme (which started much earlier and was very active in the 90s), "autonomous information technologies" (the combination of artificial intelligence, distributed control, adaptive software, quality of service monitoring … to name a few) is still very active.

I predict that there will be significant additional R&D efforts deployed in the coming decade, because of two related fields which are becoming "extremely hot", while requiring the same kind of scientific advances to transform hype into practical innovations:

  • Smart Grids, where the ambition captured by the word "smart" mirrors the goals of autonomic computing: self-adaptive, self-organizing and self-healing. There is no need to explain why smart grids are strategic for this century, but it is also easy to recognize the implicit difficulty of the endeavor. The heart of the smart grid principle is to evolve from the centralized management of current power networks towards a distributed and adaptive design, whose feasibility remains to be proven on a large scale.
  • Home Networks, which are growing like mushrooms in our houses, and which have already reached a complexity level that is unacceptable to most households. "Smart houses" are necessary to fulfill the promises of multiple industries: energy – where smart, energy-efficient houses are actually part of the previously mentioned smart grids –, content and entertainment – IP content anywhere, any time, on any device –, home security and management, healthcare – for instance, out-patient care within the home –, etc. Although the various control/distribution networks may all share the IP protocol, the complexity of provisioning, pairing, routing and interconnecting is rapidly becoming an impossible burden. Here also, "self-provisioning", "self-repair" and "self-discovery" are quickly becoming customer requirements.

It would not be difficult to define an SDT (Smart Distributed Thing) that regroups the challenges of distributed information systems, smart grids and smart home networks … With this first post of the year, I'd like to explore three ideas which I have been toying with during the past few months and which I intend to explore more seriously in the future.

  1. There is a lot of wisdom in applying biomimetics – replicating evolution – to produce autonomous systems. This is especially true for smart home automation networks. Rather than designing "a grand scheme" for "the smart house's nervous system", it is much safer to start with simpler subsystems, add a first layer of local control, then add a few reflexes (a limited form of autonomy), then add a second layer of global control … and end up with a "cortex" of advanced "intelligent" functions. Multi-layered, redundant designs such as those produced by evolution are more robust (a key feature for a home automation control network), more stable (a key insight from complex system theory which is worth a post by itself) and more manageable. The need for recursive/fractal architecture is nothing new: I wrote about it with respect to information system architecture many years ago in my first book. I went from a global architecture (which was common when EAI was a catchword :)) to a loosely coupled collection of subsystems (so-called fractal enterprise architecture), for the same reasons: increase robustness, reduce operational complexity and, most importantly, increase manageability (the rate of evolution is not constant over a large information system). There is much more than fractal design involved here: the hierarchy of cognitive functions, from low-level pulses and reflexes to skills and then "creative" thinking, is equally suited to the design of an SDT.
  2. Autonomous systems tend to scare end-users unless they embody the principles of calm computing. Calm computing is derived from the concept of ubiquitous computing (cf. the pioneering work of Mark Weiser at Xerox PARC), and addresses the concerns that emerge when "computers are everywhere (ubiquitous)". Calm computing is very relevant to SDTs; I could summarize its three main principles as follows: a smart ubiquitous system must act "in the background" and not "in your face" (it must be discreet :)), it needs to be adaptive and learn from the interaction with its users (the complexity of the users must be recognized in the overall system) and, most importantly (from the user's perspective), it should be stoppable (you must be able to shut it down easily at any time). This becomes much easier with a fractal/layered design (previous point) and more difficult with a monolithic global design. There is a wealth of ideas in the early papers about calm technology, such as minimizing the consumption of the user's attention.
  3. The emergence of software ecosystems most often needs to be guided/shepherded; it rarely occurs in the wild as a random event. This is a key point, since it is widely acknowledged that software ecosystems (such as iPhone applications) are where innovation occurs (and where value is created from the end-user's point of view). In the realm of home networks, I have been advocating open architectures, open standards and (web) service exposition for many years, thinking that open standards for the "Home Service Bus" would attract an ecosystem of service providers. You create the opportunity, and evolution/survival of the fittest does the rest (Darwin). The last two years spent thinking about sustainable development (i.e., analyzing complex systems' architectures) and looking at successful software ecosystems have made me reconsider my Darwinian position. I am much more a follower of Lamarck these days: I see a "grand architect" in the success of many application stores, iPhone being the obvious example. Open standards and open APIs are not enough. The spectacular failure of major Web Service exposure programs from large telcos is a good example. You need to provide SDKs (more assistance for the developer), a "soul" (a common programming model) and a sense of excitement/challenge/cool (which obviously requires some marketing).
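The layered design described in point 1 can be sketched as a chain of controllers where a fast reflex layer acts first and escalates to a slower, stateful "cortex" layer otherwise. The event names, actions and thresholds are invented for illustration:

```python
# Sketch of a layered smart-home controller: reflexes handle urgent,
# local events; the "cortex" reasons over history. All names and
# thresholds are invented for this example.

def reflex_layer(event):
    # Low-level, local, immediate: like a spinal reflex.
    if event == "smoke":
        return "cut power to the stove"
    return None                     # not handled here, escalate

def cortex_layer(event, history):
    # Global, slower, stateful: reasons over the whole house.
    history.append(event)
    if history.count("window_open") >= 3:
        return "suggest lowering the heating schedule"
    return "log and wait"

def smart_home(event, history):
    # The reflex layer gets the first chance to react.
    return reflex_layer(event) or cortex_layer(event, history)

history = []
print(smart_home("smoke", history))        # handled by the reflex layer
print(smart_home("window_open", history))  # escalated to the cortex layer
```

The robustness argument is visible even in this toy: the reflex layer keeps working (and keeps the house safe) even if the cortex layer is down, which is precisely what a monolithic global design cannot offer.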


This third point is precisely the case with SOA (Service-Oriented Architecture). This is an observation that I made earlier: reuse in the world of corporate IT does not occur easily or randomly; it requires serious work. To put it differently, coming up with a catalog of reusable services is not the same as deploying a service-oriented architecture (with the Darwinian hope that the "fittest" services will survive). To make SOA work, you need to organize (hence the word "architecture"), promote, plan and communicate. There is a need for a "grand architect" and a "common sense of destiny" for SOA to bring its expected benefits of sharing, reuse and cost reduction.
