Biology of Distributed Information Systems

Sunday, June 1, 2014

12 Principles of Lean Software Factories

This month’s post is a simple one, which presents the concept of LSF (Lean Software Factory) through 12 principles. It will not bring forward new ideas compared to my previous posts, but it is a fresh way to look at the combination of agile/scrum/lean/devops without over-thinking the influences or the relationships between different schools of thought. This list of twelve principles is taken from the talk that I gave at the Lean Summit in Lyon. As I stated in the introduction, this is a "Toyota Way" "how to manual" for a software development team.

1. Organize work around cross-functional united teams

Teamwork leverages the strength of strong ties, that is, the links that form within a group of people who work together all day long. It creates a shared context, which is the most efficient form of implicit communication.
A team should leverage talent diversity through cross-functionality. Cross-functionality means not only that we have multiple skills within the team (which is necessary to tackle complex tasks) but also that a fair amount of substitution is possible (many team members can lend a hand to any other member), a key for effective cooperation and flexibility.
Unity and versatility strengthen one another.
There is no longer a contractual vision of a client-supplier relationship with external hired help. Each member of the team has the same rights, which means that outside suppliers become partners.

2. Teams operate on a common synchronous time

Face-to-face communication replaces email for internal team one-to-one communication. This leverages the strength of both tone and non-verbal communication.
Every day starts with a stand-up meeting, which replaces a fair amount of one-to-one communication. The stand-up meeting builds team spirit and a common focus on the shared goal. Everyone tells where she or he stands (achievements of the previous day), explains what the objectives for the coming day are and shares possible concerns.
The team operates on a common shared time, which is the customer’s time (following the lean concept of takt time). This is a clear departure from asynchronous work, which has become the default mode for engineering in the past decades. The importance of synchronous work is well explained in “The Lean Startup”.

3. Customer-centric organization, for real.

The customer needs to be present on the software development premises. This is symbolic, through the availability of a customer wall or a customer room, which dynamically collects and displays end-user problems, insights and aspirations. This is also physical, through the presence of a “customer-proxy” role within the team.
Software development and communication are organized around « user stories ».
Continuous improvement is a cornerstone team activity, which is not de-prioritized to add new features. Lean management principles of “zero defects” and “right the first time” are applied thoroughly, because they have proven to produce customer satisfaction.
Last but not least, a customer-centric organization is bound to change its culture from the traditional project culture of software development to a product culture.

4. « Fail early to succeed sooner »: test as early as possible

« Test-driven development »: code developers need to start their programming with unit tests.
Testing must occur end-to-end, that is, from the early unit testing to the instrumented « test during production » (i.e., be able to run tests on deployed software). The (classical) lesson from software engineering is that everything should be tested “as early as possible” (unit testing, when building, when integrating, etc.).
The only way to run tests continuously is to automate them. Continuous building/integration and continuous testing are synonymous.
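As a minimal sketch of the test-first habit, here is what it can look like in Python with the standard unittest module. The function name story_points_remaining and its behavior are hypothetical, chosen only for illustration; the point is that the two tests are written first and the function is then written to make them pass.

```python
import unittest


def story_points_remaining(planned, done):
    """Return the story points still open in the sprint (never negative)."""
    return max(planned - done, 0)


class StoryPointsRemainingTest(unittest.TestCase):
    # In test-driven development these tests are written first,
    # then the function above is written to make them pass.
    def test_partial_progress(self):
        self.assertEqual(story_points_remaining(20, 8), 12)

    def test_overdelivery_is_clamped_to_zero(self):
        self.assertEqual(story_points_remaining(20, 25), 0)
```

Saved in a file, these tests can be run with "python -m unittest" on every commit, which is what makes continuous testing possible.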

5. Iterative progress through constrained « small batches »

Small batches yield better performance and more motivated teams. It is also the best way to keep teams small, which is known to be more efficient.
Time Boxing: you fit the content to the box and not the opposite! To keep a synchronous planning (delays break cooperation and are known to be very expensive), you keep to your sprint schedule and adapt the workload dynamically.
Incremental development is better at adapting to a continuously changing environment: each « small batch » gives the opportunity to listen, reflect and adapt the product strategy and priorities.
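The time-boxing rule (the box is fixed, the content adapts) can be sketched in a few lines. The Story fields, the "lower number is more important" priority convention and the greedy selection are assumptions made for this example, not a prescribed planning method.

```python
from dataclasses import dataclass


@dataclass
class Story:
    name: str
    priority: int  # lower number = more important (assumed convention)
    points: int    # estimated effort


def fill_time_box(backlog, capacity):
    """Pick stories in priority order while they still fit the fixed capacity."""
    selected, deferred, used = [], [], 0
    for story in sorted(backlog, key=lambda s: s.priority):
        if used + story.points <= capacity:
            selected.append(story)
            used += story.points
        else:
            deferred.append(story)  # waits for a later sprint
    return selected, deferred


backlog = [
    Story("login", 1, 5),
    Story("search", 2, 8),
    Story("export", 3, 3),
    Story("themes", 4, 2),
]
selected, deferred = fill_time_box(backlog, capacity=10)
print([s.name for s in selected])
print([s.name for s in deferred])
```

Note that the capacity never stretches: a story that does not fit is deferred, and the sprint end date stays where it is.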

6. « Show & Tell »: Love your code !

A software factory operates on the principle of fast changing code, which is why code must be easy to read and easy to understand, by all members of the team and not only the person who wrote it. Coding standards and pair programming are known techniques to produce easier to maintain code.
Team code reviews are a vital part of the LSF culture. On the one hand they create the right level of appropriation and common understanding that is necessary for the team to evolve its software asset. On the other hand, they create the “software pride” attitude, which is an engine for quality and innovation. This is very close to the “love of cars” that you find in a Toyota factory.
Code must not only be well structured and elegant, it must also be taken care of. The 5S practice of Lean Management applies to code : Sort (reduce the code base, apply quality metrics), systematize (organize into modules, packages & projects, apply coding guidelines), shine (clean up, improve test coverage, code reviews), standardize (make it into a set of practices), sustain (run the practices as part of the culture).

7. Use walls as tools for collective learning

Visual management is a great way for the team to communicate as a whole and to grab the dynamic “music sheet” of the product that is being built.
Walls and white boards are amazing collaborative tools. This is a proven scientific fact : a white surface that you can write on or pin things onto leverages many important features. Many people may work at the same time; multi-scale editing is easy (working at different levels of abstraction at the same time); information density is quite high; body language and dynamic processes are part of the experience.
Walls should be used to display all that is necessary to know about the software product, including its architecture and how it should operate. Architectural diagrams do not belong in folders or inside laptops, they should be displayed to contribute to the continuous training and education of the team.

8. Each team member produces what the other needs just in time

Use Kanban visuals to represent the team’s work in process (WIP). The first benefit of the Kanban display is to share the amount of ongoing work / use cases, make sure that nothing is forgotten, and avoid over-committing (accepting a workload that is too large).
The Kanban display is a grid where the different steps of the software development are represented, which makes transitions from one team member to another easier because everyone knows the other’s current workload (second step of maturity). This is also where the cross-functional nature of the team may be put to good use.
The last maturation step occurs when each team member adjusts her or his work according to the capacity of the next team member in the process chain. This is the “pull” control flow of lean management, which requires time to build but yields more efficiency through shorter development cycles.
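The three maturity steps above can be illustrated with a toy model of a Kanban board. The class KanbanBoard, the step names and the WIP limits are all hypothetical, and the sketch assumes that any item sitting in a column is finished and ready to be pulled.

```python
class KanbanBoard:
    """Columns are process steps; each has a work-in-process (WIP) limit."""

    def __init__(self, steps):
        # steps: list of (name, wip_limit) in process order
        self.order = [name for name, _ in steps]
        self.limits = dict(steps)
        self.columns = {name: [] for name in self.order}

    def has_capacity(self, step):
        return len(self.columns[step]) < self.limits[step]

    def add(self, item):
        """Accept new work only if the first step is below its WIP limit."""
        first = self.order[0]
        if not self.has_capacity(first):
            return False
        self.columns[first].append(item)
        return True

    def pull(self, step):
        """The step pulls the oldest item from the previous step, if it has room."""
        idx = self.order.index(step)
        if idx == 0:
            return None
        previous = self.order[idx - 1]
        if not self.has_capacity(step) or not self.columns[previous]:
            return None
        item = self.columns[previous].pop(0)
        self.columns[step].append(item)
        return item


board = KanbanBoard([("develop", 2), ("test", 1), ("deploy", 1)])
board.add("story A")
board.add("story B")
accepted = board.add("story C")   # develop is full, so this is refused
first_pull = board.pull("test")   # test has room: pulls "story A"
second_pull = board.pull("test")  # test is at its limit: nothing moves
```

Work only moves when the downstream step has room, which is the “pull” behavior: upstream members see the blocked column and adjust rather than pile up more work.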

9. Industrial tools for end-to-end software management

One cannot run iterative and fast development cycles without an industrial method and the use of many tools. Code management benefits from a large number of tools, many of which may be found in the open source community: version management, profiling, dependency tracking, software quality tracking, etc.
Configuration management is the cornerstone of continuous integration and continuous deployment. Software builds need to be fully automated, including the management of network, hardware, and other configuration options.
The endgame of the software factory is to build the DevOps target of programmable hardware.

10. Continuous software integration: streamlining without waiting or accumulated surprises

Continuous integration means to build every day a fully functioning complete system. The rhythm may vary but the practice of building every night a system from the code that was committed during the day has shown its merits.
This means that the integration process, which used to be tedious, will be run hundreds of times during a development cycle. Therefore, it needs to be fully automated. This goes hand in hand with automated testing. The software developers find every morning the results of running the newly built system on the test library.
Continuous integration has the great cultural advantage of reminding everyone that the whole (system) is more important than the part (the daily pages of new code).

11. Team problem solving as collaboration & learning exercises

Team problem solving is used to solve problems and continuously increase the quality of the product. It also has many side benefits: team problem solving fosters collective learning of the functioning of the system that is being built.
Collaboration and collective learning are anything but easy. Therefore, they must follow a time-proven ritual, such as Kaizen. The lean practice of Kaizen does more than solve the quality problems that are being addressed: it creates a collective understanding of the system and the various roles within the team that prevents the occurrence of many future problems.
The practice of Kaizen revolves around the lean concept of standardized work. Standardization does not mean to freeze a way of doing things, it is an evolving body of knowledge that captures the collective know-how and is used to continuously set new challenges.

12. Deploy continuously to support iterative innovation

Following the DevOps principles, software products are deployed following a fast and regular rhythm (which is different for each company). The fast pace is critical to build the customer feedback learning loop.
Continuous delivery requires risk management through the principle of concentric community circles. You start with a small test population and progressively extend to your complete customer base through steps that may be undone easily.
Each incremental development process (when you add small pieces after small pieces) is bound to produce junk over time. Thus refactoring and “tending the garden” are critical practices of agile development cycles. The new world of software is not about building a system but growing a platform.
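One possible way to implement concentric community circles is a staged rollout keyed on a stable hash of the user identifier. This is only a sketch: the stage percentages, the user id format and the function names are assumptions for illustration.

```python
import hashlib

# Hypothetical rollout stages: share of the customer base that sees the new release.
STAGES = [0.01, 0.10, 0.50, 1.00]


def bucket(user_id):
    """Map a user id to a stable number in [0, 1)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 16**8


def sees_new_release(user_id, stage):
    """True if the user belongs to the circle enabled at this stage."""
    return bucket(user_id) < STAGES[stage]


users = [f"user-{i}" for i in range(10_000)]
for stage, share in enumerate(STAGES):
    enabled = sum(sees_new_release(u, stage) for u in users)
    print(f"stage {stage}: target {share:.0%}, enabled {enabled} of {len(users)}")
```

Because each user always falls in the same bucket, the circles are nested (anyone in an early circle stays in every later one), and undoing a step is just a matter of going back to the previous stage index.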

This list is actually a simple collection of well-understood principles because it only represents what needs to be shared with the software development team. This is a “bottom-up” recipe which is an easy sell once the will to build an agile software factory is established. The hard part about lean software is stakeholder management (this is worth another post):

The role of management is deeply different from a traditional software development viewpoint.
Agility (incremental, test and learn) is a business, not a technical, mindset.
The benefits of a software factory (building a capability to continuously deliver an evolving platform as opposed to assembling a system) need to be shared and understood by the CEO.
Customer-centricity has to be deeply built into the company’s culture.

Sunday, April 27, 2014

Software Ecosystems and Application Sustainability

Today’s post is a set of simple, yet somehow deep, thoughts about the systemic nature of different ecosystems related to software. I was trained in the 80s to think about software costs in a “traditional software engineering” manner, using KPI, metrics and a spreadsheet (using cost models that were popularized by Peter Keen in the 90s). Somehow, this shows in my second book where one may find references to the classics (Barry Boehm, Capers Jones, etc.) in the bibliography section. What characterizes this way of thinking is that it is a static approach (even if the system changes, one thinks about a “snapshot” taken at a given time), controlled (one assumes that all stakeholders cooperate under a common control) and global (the cost model operates on what is thought to be a complete picture).

Life in this century is different when it comes to software. Software is a “live thing” (in the sense that it constantly evolves to adapt to its environment), mostly distributed, with a large number of stakeholders whose strategies escape the control of the software developer. Hence static should become dynamic, controlled should become collaborative and global should become distributed. The dynamic and complex relationships between the stakeholders and between the various distributed players who contribute to building a software piece yield the “ecosystem” label. This word is borrowed from biology and is a signature of complexity.

This observation is actually one of the reasons for the title of this blog, “Biology of Distributed Information Systems”. It helps to think about software and information systems by borrowing concepts from biology and ecology, and it is definitely necessary to switch from a static to a dynamic analysis.

This is a first post on this topic, so I will keep things simple (hence somehow incomplete and arguable), and focus on three ecosystems:

The OS, platform and application ecosystem
The open-source ecosystem
The application developer ecosystem

I must apologize in advance to real software experts :). First, this is a post intended for readers with no precise skills or knowledge about software. Second, I will reason in an “abstract category” way that will not dive into interesting but difficult distinctions. For instance, in this post, an “app” is a piece of interactive content, whether it is a “true application” written for a smartphone, a simple HTML page, an HTML5 page decorated with JavaScript or a hybrid mobile app. For this first post, I am aiming at a “big picture from 10000 feet up”.

1. Software Global Ecosystem

The starting point of the argument is the need for software that evolves constantly. You may accept this at face value because it is a commonly heard argument. If you need convincing, the need for constant evolution comes both from the technology (the “what is possible today?” perimeter changes constantly) and from the users. The complexity (i.e., richness) of software usage today means that the “user is in the driver seat”. That is, software needs to be co-designed with users, which is one of the principles of Lean Startup (to name one reference, hundreds would apply here). This leads to an incremental model, which in turn requires a (much) faster code production rate. I will assume that you buy this argument, since this is not the topic for this post, and it is a fairly common assumption.

From this we derive two key consequences for software in the 21st century (as opposed to the century when I was trained as a software developer):

Much more innovation is needed, which requires the help of an open innovation model. This leads to the concept of platforms, API and apps.
Much cheaper software production is needed, which itself requires a new level of sharing/reuse, based on a common/universal software architecture.

Software productivity, as defined by the cost to produce a function point, is improving slowly. This is a topic which I have addressed in depth in the previously quoted book. I have a vested interest in this question since I started my career as a computer scientist trying to build tools (languages) that would increase this productivity significantly. It turns out that the world needed a much more efficient way to reduce software development costs, as we just saw, and has found it through massive reuse, thanks to a common software architecture:

Open OS: open operating systems (Linux, Android, etc.) have become, thanks to open innovation in the form of open source, massive repositories of reusable value. For reasons that will become clear in Section 3, the world needs as few of those as possible.
Platforms: on top of OS, platforms have emerged. One may think of the most common open source tools such as Apache or MySQL, the GAFA platforms or the web browsers. The rise of platforms over the last 20 years is coupled with the rise of API (Application Programming Interfaces) and the associated technologies (XML, Web Services, REST, JSON and the like).
Apps (interactive content): this is what the end user sees and interacts with. The combination of SDKs and platform APIs, together with open-source libraries, has made the production of apps orders of magnitude more efficient than when I started writing code 30 years ago.

Please note that the word “platform” is usually ambiguous: it may mean a cloud/service platform (a back-end platform that serves a front-end app) or an open back-end that collaborates with many apps (or other service platforms) through APIs. Here I use platform in the second sense; the first is always included in the “app” perimeter because of “device agnosticism”. That is, to let the end user pick whichever device is more suited to her current context, each mobile app must have a dual cloud service platform. So, in this post, each “app” comes with an associated set of back-office/cloud services and I reserve “platform” for the implicit open innovation approach (cf. “L’âge de la multitude”).

2. Critical Mass and Software Usage

Even if software development is incremental, launching a successful product requires building a “critical mass” of value before one may start the “lean startup” positive feedback loop of co-creation. This is the “V” in MVP (Minimum Viable Product): there is a threshold of value brought to the user that one must pass before the percolation model of viral adoption (helped with proper marketing) may kick in. The analogy with living organisms (“SW as a living thing”) is relevant here: software requires growth, constant change and a sustainable equation of user growth. The MVP aims to reach the tipping point when viral adoption becomes sustainable (what Eric Ries calls “getting traction”).

The “critical mass of value” that is required for this tipping point varies considerably according to the pain point that the piece of software is trying to alleviate and according to the current state of the art. It may be measured in terms of function points (how rich an experience is necessary) and social weight. The complexity of modern experiences comes from their social nature (if you think about it, most apps on your smartphone nowadays have a social component). Hence installing a new habit requires fighting against Metcalfe’s Law, and displacing a previous social usage requires even more effort. The value critical mass may be large, which explains why some legacy Microsoft products such as Word, which I am still using to write technical papers, have not been easily replaced by open source alternatives.

To reach this “value critical mass”, someone must invest an initial significant software development effort. In a complex world, where the risk to fail is high, one must reduce this initial software development cost, as we explained earlier. This is also a signature of the complex environment we live in: we must switch from ROI (return on investment) to the affordable loss principle.

The percolation model shows why the ROI principle is no longer relevant: it is very hard to predict how well a successful MVP will percolate. The world is full of software startups with amazing valuations because their app found its way to massive deployment and usage. But it would be very difficult to predict such success a few years earlier, when the MVP was still a prototype.

The affordable loss approach means reducing the development cost to something that “you may afford to lose”. This means leveraging the previously shown layered architecture, and mostly leveraging the strength of the open source community. Open source software is a machine that is constantly churning out software platforms that start their own journey towards fame and critical mass. The quality of open source software is directly related to its adoption (because good software is built incrementally through feedback, a key axiom). Adoption is inversely proportional to genericity, hence using open source software is a “connected art”. One must understand the communities’ sizes and dynamics to select “the pieces of the puzzle”. Open source software yields by construction a nested / layered structure of software libraries with a combination of very high quality stable platforms for the common needs and more experimental gems with specific capabilities. This is why the word “ecosystem” is so relevant to open source software. One should not think of a catalog of free software, but of a nested hypergraph of communities, where a collaborative price must be paid. This price is measured in (participation) time and (code) sharing. This represents a culture shift for most software organizations, but the efficiency of those who “play the game right” is such that it becomes the only game worth playing.

This is just a hint of what a proper open source strategy should be, since there are so many aspects that I am not touching here. Open source is not only about software libraries, it is also about development methods, tools and processes, cloud computing and hardware, to name a few. As a follow-up, I would suggest reading Octo’s great book “Les Géants du Web”, taking a closer look at DevOps or leveraging the value that is found in the Open Compute project.

3. The App developer's equation

This last part looks at app sustainability from the perspective of the developer. I have used the following (abstract) equation in my presentations for the past 10 years:

Attractiveness = Market x Generosity x Value / Effort

In this equation,

Market is the size of the potential market that a given platform is proposing. This equation was developed to understand the fate of mobile OS, but it applies to all kinds of platforms, from cloud service platforms that propose APIs (another dream of phone operators for the past 10 years) to connected objects.
Generosity is the share of the revenue (app price or advertising revenue) that is sent back to the developer. For instance, Apple keeps a hefty 30%, whereas Android is more generous. Set-up costs should be factored in, for those platforms where some form of license or tools investment is still necessary.
Value is what the platform brings to the developer, as far as the end user experience is concerned. When I said “abstract” earlier, I meant that I don’t have a formula to measure “Value”. Most often, it is a judgment call from the developer, who evaluates what innovative and relevant services may be developed with the platform. There is subjectivity involved, such as the infamous “cool factor”, which favors sets of APIs & features from which “cool stuff” may be built.
Effort is the amount of time it takes to build “one unit of value”. This is where the difference stands between great players, who provide the right SDK, community support, testing services, and an efficient delivery (store) platform, and other less qualified players. Many API exposure programs have failed during the past 10 years because the effort expected from the developer was much too high. This is also why one should enroll help from qualified actors such as Apigee to develop an API strategy.
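To make the equation concrete, here is a small sketch that evaluates it for two hypothetical platforms. The function name and all numbers are invented for illustration; since “Value” has no formula, it is treated here as a subjective score chosen by the developer.

```python
def attractiveness(market, generosity, value, effort):
    """Attractiveness = Market x Generosity x Value / Effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return market * generosity * value / effort


# Purely illustrative numbers, not measurements of any real platform.
# market: potential users; generosity: revenue share kept by the developer;
# value: subjective score; effort: time to build one unit of value.
platform_a = attractiveness(market=100e6, generosity=0.7, value=1.0, effort=2.0)
platform_b = attractiveness(market=10e6, generosity=0.9, value=1.2, effort=1.0)
print(platform_a, platform_b)
```

With these made-up inputs the larger market dominates, even though the second platform is more generous, scores higher on value and requires half the effort.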

Roughly speaking, the equation is an abstract form of “Expected income / Expected Effort”. I have used this equation in the past years to explain why there would be two or at most three mobile open OS in the future, but it actually tells a lot of things. You may understand why Microsoft announced (at last) that Windows Phone would be free in the future. The “Market factor” yields a “winner take all” dynamic that we have observed for many platforms (the Matthew effect: the platform with the most users attracts more developers, benefiting from more open innovation, hence attracting more new customers). It also gives a few insights for a successful platform strategy:

Growth: get a critical mass of customers, as quickly as possible. To jumpstart the virtuous cycle (that is, to enroll app developers while the market is still not there), one must use rewards and gamification, such as hackathons.
Expose as much value from your APIs as possible, with a focus on differentiation, that is, exposing stuff which is both useful and not readily available elsewhere. This is probably the most strategic factor to predict the success of connected objects in the future, or larger domains such as the connected (smart) home.
Reduce the effort for the developer by embracing the open source and Web standards (languages, development tools, API styles, libraries, etc.). The adjacent illustration of the “Fun vs. Effort” is taken from a humorous site, but thinking in terms of value/effort is critical to system analysis.

This equation also gives a way to evaluate the intrinsic value of a platform. It follows from what was said earlier that the value is the capacity to generate revenue streams from apps. This, in turn, is mostly related to accumulated user data. This leads to the idea, reported by Henri Verdier, that data is the new code. The algorithms change constantly, and the best ones are produced by external developers (hence the open innovation paradigm). What changes more slowly is the API structure and what accumulates over time is the amount of user data. This is important enough to be worth a future post when I report about the work of our group at the NATF on Big Data.

Friday, February 21, 2014

Complex systems, Scale-free networks and Affiliation networks

This post is a follow-up of the previous one about complex systems. I will focus on its fifth point, which said that efficiency in a complex system is strongly related to the capability to support information exchange flows, and will expand on the importance of scale-free networks. The importance and the size of information flows are directly related to the complexity of such systems, which is the amount of interaction between the components.

What makes a good information flow network for a complex system? Here are five characteristics that spring to mind:

Latency: to minimize the time it takes for some information to travel from one subsystem to the other.
Throughput : to simultaneously transmit large amounts of information between all subsystems.
Resilience: to continue functioning when some subsystems or some links become unavailable. These three characteristics are universal for all complex systems, for instance information systems or enterprise communication channels.
Searchability: to ease the task of finding related subsystems through the exploration of the communication network. This property is related to dynamic growth and self-organization. In an autonomic system, automatic discovery of new features and new components is at the heart of the system’s dynamic organization.
Cost: to minimize the total weight of the communication network, whether we talk about energy, mass or dollars. Related to cost is the scalability of the communication network structure. Scalability means that the structure may easily evolve as the complex system grows.

To reduce latency, one must reduce the diameter of the network, which is (roughly) the average path length. The easiest way is to add links. Similarly, to increase throughput and resilience, one must rely on path redundancy (the fact that many paths exist for routing one flow). However, there is a double trade-off: increasing the average degree and the number of edges both increases the cost and reduces searchability.
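The effect of adding links on diameter and average path length can be checked on a toy network. The sketch below uses a ring of 20 nodes and two arbitrary shortcut links (both choices are assumptions for illustration) and computes shortest paths with a breadth-first search.

```python
from collections import deque


def ring(n):
    """Adjacency sets for a ring of n nodes (each node linked to its two neighbours)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_link(graph, a, b):
    graph[a].add(b)
    graph[b].add(a)


def distances_from(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def diameter_and_average(graph):
    """Longest and mean shortest-path length over all ordered pairs of distinct nodes."""
    longest, total, pairs = 0, 0, 0
    for source in graph:
        for target, d in distances_from(graph, source).items():
            if target != source:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, total / pairs


plain = ring(20)
plain_result = diameter_and_average(plain)

shortcut = ring(20)
add_link(shortcut, 0, 10)
add_link(shortcut, 5, 15)
shortcut_result = diameter_and_average(shortcut)

print("ring:", plain_result)
print("ring with two shortcuts:", shortcut_result)
```

Two extra links are enough to shrink both numbers, at the price of two more edges to pay for and a less regular structure to search.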

Nature seems to have found the perfect solution for this trade-off with the scale-free network structure. A scale-free network is a graph whose degree distribution follows a power law. Compared to a random graph, this means that there is a higher frequency of highly connected nodes, with large degrees. Scale-free networks have many wonderful properties, as explained by Duncan Watts or Albert-Laszlo Barabasi. Their diameter is logarithmic in their size, and they are very resilient, that is, their level of connectivity changes little when some nodes become unavailable. The name “scale-free” comes from the self-similarity that the degree distribution implies. Somehow, a scale-free network may be seen as a “fractal structure”, which makes it an interesting candidate for self-growth and self-organization.
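The difference with a random graph is easy to observe in a small simulation. The sketch below grows a graph with a simplified preferential attachment rule (new nodes link to existing nodes with probability proportional to their degree, the “rich get richer” growth associated with Barabasi’s work) and compares its largest hub with that of a uniformly random graph with the same number of nodes and edges. The sizes, the seed and the function names are arbitrary choices for illustration, and this is not a faithful reproduction of any published model.

```python
import random
from collections import Counter


def preferential_attachment(n, m, rng):
    """Grow a graph to n nodes; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    edges = []
    # Every node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = []
    # Seed: a small complete graph on m + 1 nodes.
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            edges.append((a, b))
            endpoints += [a, b]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints += [new, t]
    return edges


def random_graph(n, edge_count, rng):
    """Same number of nodes and edges, but endpoints picked uniformly at random."""
    edges = set()
    while len(edges) < edge_count:
        a, b = rng.sample(range(n), 2)
        edges.add((min(a, b), max(a, b)))
    return list(edges)


def degrees(edges):
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


rng = random.Random(42)
grown = preferential_attachment(2000, 2, rng)
uniform = random_graph(2000, len(grown), rng)
print("largest degree, preferential attachment:", max(degrees(grown).values()))
print("largest degree, uniform random graph:   ", max(degrees(uniform).values()))
```

Both graphs have the same average degree; what differs is the tail of the distribution, where the grown graph produces a few hubs that the uniform graph does not.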

What has been found in the past 20 years is that scale-free networks are everywhere, both in the nature-made complex systems (such as the network of chemical reactions within the brain), and in the human-grown systems that incorporate feedback and learning, such as the Web (network of pages) or the Internet (network of computers). Let me quote the introduction of « Scale-Free Networks : A Decade and Beyond » from Albert-Laszlo Barabasi : “For decades, we tacitly assumed that the components of such complex systems as the cell, the society, or the Internet are randomly wired together. In the past decade, an avalanche of research has shown that many real networks, independent of their age, function, and scope, converge to similar architectures, a universality that allowed researchers from different disciplines to embrace network theory as a common paradigm.” As with any general big idea, this is an approximation of the real world, and there are some debates whether real networks have an exact power law for their degree distribution. Still, it is both a useful and powerful concept, when trying to design communication networks.

I will now write a brief summary of « Linked », a great book by Albert-Laszlo Barabasi. I read this book many years ago, and promised to give a review in my other blog, but never found the time to do it. Still, it is very relevant to what I just wrote (together with many other books which I have selected in this post) since it contains a lot of details and examples about the importance of scale-free networks. The following is a short list of relevant key ideas that are well illustrated in this book, with no claim of completeness:

The book starts with the concepts of diameter and average path length. Throughout the book, many examples are given of really large networks with small diameters. For instance, the Web (URL network) diameter is 19. Another interesting example is the molecule interaction network in a living cell, through chemical reactions. The “diameter” is only 3 (three degrees of separation). Lately we have learned that the diameter of Facebook social graph is 4.7.
By looking more closely at these networks, we see that the short diameter is not due to the number of edges but to the presence of “connectors” (hubs), as defined by Malcolm Gladwell in “The Tipping Point” :) This is true for cell reactions, where a few molecules interact with many others. Small-world networks, as defined by Watts and Strogatz, also exhibit a higher clustering coefficient than random graphs. These small-world structures may be thought of as small tightly connected groups, linked by connectors, that is, hubs with high degrees.
This leads to the concept of scale-free networks, by looking at the node degree distribution law. The presence of connectors is the result of power laws, which are also called “fat tailed” because the number of nodes with very high degree is much higher than with a typical “exponential decay” law. Another interesting example of scale-free networks is the graph of word co-occurrence in natural language.
A good part of the book deals with how scale-free networks may be grown, that is, how they emerge in real life. This leads to the powerful “rich get richer” paradigm (also called the Matthew Effect), where the probability of creating a new edge is proportional to the existing degree. Growth is a signature of scale-free networks. I quote from the book: “The power laws emerge – nature’s unmistakable sign that chaos is departing in favor of order. The theory of phase transitions told us loud and clear that the road from disorder to order is maintained by the powerful forces of self-organization and is paved by power laws”.
A very interesting part of the book deals with resilience, with examples drawn from biology such as the protein network in our metabolism. There is an interesting comparison with hierarchical networks (such as organizational charts in a traditional company or electricity distribution network) which are less fault-tolerant than scale-free networks (even with added redundancy for the high value links). Another quote: “The coexistence of robustness and vulnerability plays a key role in understanding the behavior of most complex systems. Simulations have shown that the protein network refuses to break apart under randomly generated mutations.”
Scale-free networks are graphs, with edges between two nodes that only describe binary interactions. Most real-world complex systems use more complex “n-ary” interactions, which could be described with hypergraphs, two-mode networks or affiliation networks. For instance, the meetings between coworkers in a company or the chemical reaction networks are hypergraphs. A meeting is a hyper-edge since it binds many participants; a chemical reaction is also a hyper-edge in the molecule graph. It is easy to model an affiliation network with a regular bipartite graph (just add a few nodes for the hyper-edges), so this is not a big technical difference, but more and more interest is given to affiliation networks since they are very common in the real world of complex systems.
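The bipartite trick mentioned above (one extra node per hyper-edge) is easy to show on a toy example. This is a small sketch in Python; the meeting names, the participants and the helper functions are all hypothetical and only serve the illustration.

```python
from itertools import combinations

# Hypothetical data: each meeting is a hyper-edge binding several participants.
meetings = {
    "standup_A": {"ann", "bob", "cho"},
    "standup_B": {"dan", "eve"},
    "town_hall": {"ann", "dan", "fay"},
}


def to_bipartite(hyperedges):
    """Edge list of the bipartite graph: one (hyper-edge node, member) pair
    for every membership, so each hyper-edge becomes an ordinary node."""
    return sorted((h, m) for h, members in hyperedges.items() for m in members)


def project(hyperedges):
    """One-mode projection: two people are linked when they share at
    least one hyper-edge (here, when they attend the same meeting)."""
    links = set()
    for members in hyperedges.values():
        links.update(combinations(sorted(members), 2))
    return links


if __name__ == "__main__":
    for edge in to_bipartite(meetings):
        print(edge)
    print(sorted(project(meetings)))
```

The projection loses information (it no longer says which meeting created a link), which is one reason to keep the bipartite form when studying affiliation networks.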

Five years ago I decided to see if Duncan Watts’s results would also apply to Affiliation Networks. I wrote a paper entitled “Efficiency of Meetings as a Communication Channel : A Social Network Analysis” which I presented at the “Management and Social Networks” conference in Geneva (2012). The main findings may be described as follows:

I have shown that the most efficient meeting network structure relies on small meetings that have a high frequency. There is no surprise here, since this is a tenet of agile companies which are organized around daily short team meetings. Still, it is interesting to see that this is a deep structural property of the underlying network.
I have proposed a “latency performance indicator” that predicts the speed of information propagation as “ #of-monthly-meetings * log(#people-that-one-wants-to-communicate with) / log(#people-that-one-actually-meets-in-a-month)”. For those mathematically inclined, one may retrieve the best practices (fewer meetings, frequent meetings, a few large meetings) within the formula.
The most interesting piece is the emergence of a small-world structure as the most efficient meeting network, which is a hybrid combination of small team meetings and a few larger meetings. This reproduces, in the case of an affiliation network, the results found ten years ago by Duncan Watts. It suggests that companies should reproduce the diversity found in nature, implement path redundancy and combine many really small and frequent meetings, such as SCRUM stand-up meetings, with a few overlapping “town-hall” meetings (large audiences).
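The latency indicator quoted in the findings above is simple enough to compute directly. Here is a minimal sketch in Python that transcribes the formula as written; the function name, the parameter names and the sample figures are hypothetical, and the post does not state the unit of the result, so the value should only be used to compare scenarios with each other.

```python
import math


def latency_indicator(monthly_meetings, target_audience, monthly_contacts):
    """Transcription of the quoted formula:
    #meetings * log(#people to reach) / log(#people actually met in a month).
    The log ratio can be read as the number of relay hops needed when each
    person passes information on to `monthly_contacts` others."""
    if monthly_contacts <= 1:
        raise ValueError("monthly_contacts must be greater than 1")
    if target_audience < 1:
        raise ValueError("target_audience must be at least 1")
    return monthly_meetings * math.log(target_audience) / math.log(monthly_contacts)


if __name__ == "__main__":
    # Invented scenarios: same audience of 1000 people, different meeting habits.
    print(latency_indicator(monthly_meetings=20, target_audience=1000, monthly_contacts=10))
    print(latency_indicator(monthly_meetings=20, target_audience=1000, monthly_contacts=100))
```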

Scale-free networks are similar to what sociology calls “ambidextrous organizations”. Ambidextrous organizations leverage the power of cliques and the strength of weak ties. The “power of cliques” is precisely the strength of team work: small groups of people that are all connected to one another (hence the clique name), establishing “strong ties” (which means frequent in the world of social network science). The “strength of weak ties” is the law established by Mark Granovetter that says that we need to use our “weak ties”/extended network to get out of difficult or exceptional situations. “Weak ties” refer to people that we see rarely (as opposed to strong ties); the “weak ties” make the edge of our social network, they provide the diversity of viewpoint and culture which is often absent from the core of our social network (since “strong ties” tend to be very similar to ourselves).

The idea that complex systems sciences in general and social network structures in particular, are relevant to enterprise organization is becoming more and more popular (this is precisely the topic of my other blog). I will conclude with three examples which are closely related to this post, since those three theories attempt to improve management efficiency through a better-designed information network:

Sociocracy uses circles as a team structure, and doublelinks (each intersection between circles is represented by two individuals) to implement redundant information propagation paths. The illustration is taken from Wikipedia.
BetaCodex is a management theory and practice whose claim is to “organize for complexity”. It is based on a cellular network structure, which draws its organizing principles from biology. The tree structure is replaced by a denser network of circles (with a clear reference to sociocracy), providing shorter and more resilient information propagation paths.
Holacracy is another recent management theory that draws on complex system theory. Here again we find a system of self-organizing circles (with a similar influence from sociocracy). The most defining feature of holacracy is “to organize around purpose” (cf. the fourth principle of our previous list).

Tuesday, December 31, 2013

Seven Keys for Complex Systems Engineering

I gave a talk early this year at the “IRT SystemX” inauguration, about the challenges that occur when engineering “Systems of Systems”. This talk is a quick introduction of what we can learn from complex systems when designing large-scale interactive industrial systems. Complex systems are defined by their goals (purpose) and a set of sub-systems with rich interactions. The complexity of these interactions yields the concept of emergent behavior. Complex systems have a fractal nature, that is, they exhibit multiple scales, both from a physical/descriptive level and from a temporal level. Complex systems embed memory and have the capability to learn, which makes them both dynamic and adaptive systems. They interact constantly with their environment, which means that a dynamic vision of flows is more relevant than a static description of their top-down decomposition. Most complex systems renew their low-level components in a continuous process. Teleonomy and process analysis are, therefore, the most useful approach to capture the essence of a complex system.

I have become gradually fascinated by the topic of complex systems because I find it everywhere in my job and my own research. Complex systems is the right framework to understand the management and the organization of modern enterprises. This is the topic of my other blog. All that is said about complex systems in the previous paragraph applies to a company. I also found that this applies to information systems as well. The main reason for creating this blog was the realization that the proper control for information systems has to be emergent, following the lead of Kevin Kelly and the intuition behind Autonomic Computing. Last, complex systems are everywhere when one tries to understand the most common business ecosystems, such as smartphone application development, smart homes or smart grids. I have talked about Smart Grids Players as a Complex System in this blog. More examples may be found in my keynote at CSDM 2012.

There is a paradox with the popularity of “complex systems science” in today’s business culture. On the one hand, the importance of complex systems’ concepts is obvious everywhere: system of systems, enterprises, markets. On the other hand, the practical insights are not so clear. « System thinking » has become a buzzword and the word “complexity” is everywhere … still many textbooks and articles which claim to apply “the latest of complex science theory” to business and management problems are either obscure or shallow. This is not to say that there does not exist a wealth of knowledge and practical insights that is available in complex systems literature. On the contrary, the following is a selection of some of the books which I have found useful during the last few years.

Today’s post is a crude and preliminary attempt to pick seven keys that I have found in these books which, to me at least, are practical in the sense that they unlock some of the complexity – or mystery – of the practical complex systems which I have encountered. There is no claim of completeness or rigorous selection. This is clearly a personal and subjective list which I consider a « work in progress ». This is just a list, so I will not develop each of the seven keys here, although each would deserve a blog post of its own.

Complexity means that forecasting is at best extremely slippery and difficult, and most often outright impossible. This is, for instance, the key lesson from Nassim Taleb’s books, such as The Black Swan. The non-linearity of complex system interactions causes the famed butterfly effect, in all kinds of disciplines. If you line up a series of queues, such as in the Beer Game supply chain example, each queue amplifies the variations produced by the previous one and the result is very hard to forecast, hence to control (this depends, obviously, on the system load). This does not mean that simulation of complex systems is useless; it means that it must be used for training as opposed to forecasting. Following Sun Tzu or François Jullien, one must practice “serious games” (such as war games) to learn about complex systems from experience. This complexity also means that one needs as much data as possible to understand what is happening, and should beware of simplified/abstract descriptions. “God is in the detail” has become a very popular business idiom in the last decades.
Complex systems most often live in a complex environment which makes homeostasis an (increasingly) complex feat of change management. Homeostasis describes the process through which a complex system continuously adapts to its changing environment. The characteristic of successful complex systems, in a business context, is the ability to react quickly, with a large range of possible reactions. This applies both at the level of what the system does and what it is capable of doing. This is illustrated by the rise of the word “agility” in the business vocabulary. The law of requisite variety tells us why detailed perception is crucial for a complex system (which is clearly exemplified by recent robots): the system’s representation of the environment should be as detailed/varied as the sub-space of the outside environment that the homeostasis process needs to react to.
Complex systems, because of their non-linear interactions in general, and because their components have both memory and the capability to learn, exhibit statistical behaviors which are quite different from “classical” (Gaussian) distributions. This is one of the most fascinating insights from complex systems theory: fat tails (power laws) are the signature of intelligent behavior (such as learning). In classical physics or statistics, all individual events are (most often) assumed to be independent, which yields the law of large numbers and Gaussian distributions. But when the individual events are caused by actors who can learn or influence each other, this is no longer true. Rather than the obvious reference to Nassim Taleb, the best book I have read on this is The Physics of Wall Street. This works both ways: it warns us that “black swans” should be expected from complex systems, but also tells us that some form of coordinated behavior is probably at work when we observe a fat tail. There is another interesting consequence: small may be beautiful with complex systems, if adding many similar sub-systems creates unforeseen complexity! Classical statistics is all in favor of large scale and centralization (reduction of variability) whereas complex behavior may be better understood with a de-centralized approach. This is precisely one of the most interesting debates about smart grids: if there is no feedback, learning and user behavior change, the linear nature of electricity consumption favors centralization (and large networks); if the opposite is true, a system-of-systems approach may be the best one.
Resilience in complex systems often comes from the distribution of the whole system purpose to each of its subcomponents. This is another great insight from complex system theory: control needs to be not only distributed (to sub-systems) but also declarative, that is, the system’s purpose is distributed and the control (deriving the action from the purpose) is done “locally” (at the sub-system level). This idea of embedding the whole system’s purpose into each component is often referred to as the holographic principle, with a nice hologram metaphor (in each piece of a hologram, there is a “picture” of the whole object). This principle has been proven many times experimentally with information systems’ design: it has produced “policy-based control”, where the goals/SLA/purposes are distributed in a declarative form (hence the word “policy”) to all sub-components. I gave the example of SlapOS in my IRT talk as a great illustration of this principle. This is also closely related to the need for fast reaction in the homeostasis process: agility requires distribution of control, with a bottom-up / networked organization similar to living organisms (for most critical functions). One of my favorite books applying this to the world of enterprise organization is “Managing the Evolving Corporation” by Langdon Morris.
Efficiency in a complex system is strongly related to the capability to support information exchange flows. There is a wealth of information about the structure of information networks that best support these flows. Scale-free networks, for instance, occur in many complex systems, ranging from the Web to the molecular interactions in living cells and including social networks. Scale-free networks reduce the average diameter, among other interesting properties, and can be linked to avoiding long paths in communication chains, both for agility and resilience. The challenge that these information flows produce is represented by the product of the interaction richness (essence of complexity in a complex system) and the high frequency of these interactions (our key #2) – the product of two large numbers being an even larger number. My other blog is dedicated to the idea that managing the information flows is the most critical management challenge for the 21st century (an idea borrowed from “Organizations” by March & Simon). For instance, the necessity to avoid long paths translates into versatility: complexity prevents specialization, because too much specialization generates even more synchronization flows. This communication challenge is not simply about capabilities (“the size of the communication pipes”), it is also about semantics and meaning. A common vocabulary is essential to most “systems of systems”, whether they are industrial systems or companies.
Complexity in time is something that is difficult for humans to appreciate. One of the most critical aspects of complex systems is the presence of loops, mostly feedback loops. Peter Senge and John Sterman have written famous books about this. Reinforcement and stabilizing loops are what matter the most when trying to describe a complex system, precisely because of their non-linear natures. The combination of loops, memory and delays causes surprises to human observers. John Sterman gives many examples of overshooting, which happens when humans over-react because of the delay. Kevin Kelly gives similar examples related to the management of wildlife ecosystems. The lesson from nature is a lesson of humility: we are not good at understanding delays and their systemic effects in a loop. In the world of business, we have difficulty understanding the long-term consequences of our actions, or simply visualizing long-term equilibriums. Many people think that user market share and sales market share should converge, given enough years, without seeing the bigger picture and the influence of attrition rate (churn). Even simple laws such as Little’s Law may produce counter-intuitive behaviors.
Efficient control for complex systems is an emergent property. Control strategies must be grown and learned, in a bottom-up approach as opposed to a top-down design. We are back to autonomic computing: top-down or centralized control does not work. It may be seen as another consequence of Ross Ashby’s law of requisite variety: complete control is simply impossible. Adaptive control requires autonomy and learning. This is, according to me, the key insight from Kevin Kelly’s book, Out of Control: “Investing machines with the ability to adapt on their own, to evolve in their own directions, and grow without human oversight is the next great advance in technology. Giving machines freedom is the only way we can have intelligent control”. This insight is closely related to our key #4: autonomy and learning progressively transform distributed policies into emergent control. There exists another corollary from this principle: such policies, or rules, should be simple, and the more complex the system, the simpler the rules. One could say that this is nothing more than the old idiom KISS, a battlefield lesson from engineering lore. But there is more to it: there seems to be a systemic law that is supported by business experience: only simple explicit rules provide long-term value to complex systems. Any rule that is complex has to be implicit, that is, constantly challenged and re-learned.
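The sixth key in the list above (loops and delays) lends itself to a tiny experiment. The following Python sketch is only an illustration, not a model taken from Sterman or Kelly: the function name, the target, the gain and the delay values are all assumptions. A stock is corrected at each step toward a target, but the correction is based on a reading of the stock that is a few steps old.

```python
def simulate(target=100.0, gain=0.3, delay=0, steps=40):
    """Adjust a stock toward `target` at each step, using an observation
    of the stock that is `delay` steps old. Returns the full trajectory."""
    history = [0.0]
    for t in range(steps):
        # The decision maker only sees the state as it was `delay` steps ago.
        observed = history[max(0, t - delay)]
        history.append(history[-1] + gain * (target - observed))
    return history


if __name__ == "__main__":
    no_delay = simulate(delay=0)
    with_delay = simulate(delay=3)
    print("peak without delay:", max(no_delay))
    print("peak with a 3-step delay:", max(with_delay))
```

With an up-to-date reading the stock approaches the target from below; with a stale reading, the same correction rule keeps pushing after the target has been reached and the stock overshoots before coming back, which is the over-reaction pattern described in the text.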

Sunday, October 20, 2013

Lean Startup & Lean Innovation Factory

I had the privilege to attend the Lean IT Summit in Paris a week ago, and was pleased to hear “The Lean Startup” mentioned in almost half of the talks. Actually, the Lean Startup is so popular that some are getting annoyed :) I co-wrote the preface of the French edition because I am a strong believer in the principles that Eric Ries explains in his book. However, with popularity comes exaggeration and re-interpretation. Here are two things I heard during the lean IT summit that got me annoyed as well:

The Lean Startup is what the lean community has expressed for a long time, with better words. Kudos to Eric Ries for being such a great communicator !
The Lean Startup is a lean reformulation of well-known innovation practices. Actually, innovation is in the genes of lean manufacturing, so no surprise there !

I disagree on both counts:

The Lean Startup is not a book about lean, it’s a book about innovation, mostly in startups but also relevant for larger companies, which is why I am such a strong advocate. After writing part of the preface, I ordered many dozens of copies of the book, which I have distributed freely in my own company. Sure, the lean framework gives a lot of sense to the overall contribution, but this is not the point.
Although many of the key ideas have been around for a while, the combination of these principles into a well-defined innovation process is a true contribution. It definitely goes against what most people believed to be innovation in larger companies. I had heard Eric Ries’s ideas expressed by a few VCs from Silicon Valley, but they were anything but mainstream.

Hence this short post is about two things. The first part is a “Lean Startup for dummies” summary. It is by no means thorough nor complete, my French post from two years ago did a better job, but it is written for the corporate world and emphasizes what may be seen as “different”, at least compared with how “innovation” was described ten years ago, when we talked about “ideation factories”. The second part describes what I call “Lean Innovation Factory”, that is the application of Lean Startup principles to the innovation division of a large company.

1. Lean Startup for dummies

Eric Ries’s book deserves to be read because it is filled with meaningful examples. Therefore, a short summary cannot do justice to its content. Here I will only pick three key principles:

(a) Innovation is about doing, not about producing ideas

This principle is very similar to what the pretotyping manifesto promotes. The pretotyping manifesto gave us these mottos: innovators beat ideas, pretotypes beat productypes, building beats talking… which all tell that the key part in innovation is the doing. This is especially true in the digital world, and is acknowledged by similar mottos from Google (“Focus on the user and all else will follow”, “Fast is better than slow”) or Facebook (“code wins”, “done is better than perfect”). To innovate means, most of the time and above everything else, to meet a customer problem and to remove a pain point. Value creation occurs at the contact with the customer, not in a brainstorming room. This does not mean that ideation tools and techniques are not useful or important; it means that only “on the gemba” can we check that innovation actually works.

This is more revolutionary than it may sound for larger and older companies, which have associated the “innovation” word with “great ideas”. I have in my library dozens of books about innovation that distinguish between all kinds of innovation (according to the source of the “newness”) and that propose many processes for reaching all kinds of customers. The beauty of the lean startup framework is to simplify – so to speak, since value-creation-at-the-hands-of-the-customer is indeed hard – and to get rid of all the innovation funnels and ideation laboratory paraphernalia. What is clear to me after 15 years in the world of telecommunication service innovation is that everyone has the same ideas; the difference between success, failure and doing nothing (the most frequent case) is the quality of the execution process.

(b) Innovation requires iteration since nobody gets it right the first time.

This principle is often associated with the motto: fail fast to succeed sooner. In the Lean Startup world, it leads to the MVP: minimum viable product. Each word is important: an MVP is a product that may be placed in our customers’ hands (this is not a prototype, it may be simple but it should not be fragile). An MVP is “viable” when it solves the customer’s problem. Its role is to jumpstart an iterative process of feedback collection, which may only happen if the customer finds a practical interest in the MVP, on the first day. An MVP is “minimal” because it is “as simple as possible but not simpler”, to paraphrase Einstein. This allows us to start the iteration as soon as possible, but not sooner. This emphasis on iteration echoes what a venture capitalist from Silicon Valley told me six years ago: there is no correlation between the success of a software startup and the quality of the piece of code that is shown to the early investors. On the other hand, there is a clear correlation between success and the ability to listen to the feedback of early customers and turn it into improvements.

This is also a bigger difference than one may think with the prevailing culture of large companies. It goes against the myth “you must get it right on the first time; you have only one chance to make the right impression”. The common culture of detailed market studies, coupled with the practice of lengthy marketing requirements, is replaced by a “hands-on” culture. MVP is a process that co-constructs software code, requirement and detailed specifications at the same time.

(c) A successful business model is built iteratively using customers’ feedbacks.

A successful business model is not a pre-condition but a post-condition for the innovation process. A startup is a “business model factory”; this is well understood today by the various startup “incubators” and “accelerators” and it may be acknowledged as one of Eric Ries’ contributions. To make a “business model factory” deliver, one needs three things. First, we need to set up measurement points in our MVP. We need to measure usage and value creation, that is, how the problem is being solved. Second, we need to build and then validate a value creation model, which Eric Ries calls innovation accounting. This is the direct application of the old saying “a measure is worth nothing without a model” (without a model, one does not know how to interpret a measure). This is an iterative process and not an exact science, where trial and error is the common approach. One formulates hypotheses, which are either validated or invalidated by the collected measurements. Eric Ries is adamant in his book about preferring facts to opinions :). Last, when the model fails, the startup needs to “pivot”, that is to formulate a new value creation hypothesis. A key contribution from The Lean Startup is the wealth of examples and explanations regarding business models and pivoting.

This third principle is no less of a rupture with respect to the sanctity of the business case and its return on investment (RoI) that is observed in many large companies. It is simply not possible to formulate a credible business case when one starts to innovate. Obviously, one needs to start somewhere, hence there must be some initial hypotheses regarding value creation. However, the business model for the MVP is the result of an iterative process; the good news is that it comes with the validation provided by usage measures.

2. Lean Innovation Factory

I have started to use the term « Lean Innovation Factory » as a way to encapsulate principles from The Lean Startup applied to the innovation division of a large company, such as the one that I manage at Bouygues Telecom. The name Lean Innovation Factory (LIF) captures three ambitions:

(1) It is an innovation factory.

A “lean innovation factory” is a process that produces innovations. An innovation is a product or service that solves a problem, which is demonstrated in the hands of a customer. The process does not need to deliver a full-scale solution to prove its effectiveness, it can operate on a smaller set of customers, but only the “monitored feedback” of real users will validate the creation of innovative value. The emphasis is on “doing” and “building”; ideas have no glorified status in the Lean Innovation Factory, we strive for physical products and running software. We make ours the words of W. Edwards Deming: “In God we trust, all others must bring data”.

(2) It follows the “Lean Startup” principles.

The engine for creating value is the iteration of MVP feedback, which means that we strive to build the first MVP as quickly as possible (fail fast to succeed sooner), while keeping the meaning of “viable” in mind: the MVP is not a prototype, it is a product. We implement the heart of innovation accounting, in the sense that we measure feedback and we continuously build a value creation model that is validated or invalidated by our users.

(3) As a “factory”, the process is as important as the end result, because the result keeps changing while the strengths and the skills of the “factory workers” may build up.

This is the same pitch that I made for the “lean software factory”, and a reason for choosing a similar name :) To build a lean innovation factory is not only to build great product or service innovations, it means to build an organization that learns to do this better and better over time. This is clearly what Eric Ries tries to teach from his own experience with many startups, and where the link with "The Toyota Way" is the most evident.