Sunday, December 20, 2015

Event-Driven Architecture and Biomimicry



1. Introduction


Ten years ago I simultaneously discovered the concept of Autonomic Computing and the fascinating book “Out of Control – The New Biology of Machines, Social Systems and the Economic World” by Kevin Kelly. This came at a moment when I was still the CIO of Bouygues Telecom and puzzling over the idea of “organic operations”. I had become keenly aware that high availability and reliability were managed – on paper – using a mechanistic and probabilistic vision of systems engineering, while real-life crises were solved using a more “organic” understanding of how systems actually worked. This is described in more detail in my first book. Autonomic computing gave me a conceptual framework for thinking about organic and self-repairing system design. I then had the chance to learn about Google operations in 2006, including a long discussion with Urs Hölzle, and found that many of these ideas were already applied. It became clear that complex properties such as high availability, adaptability or smart behavior could be seen as emergent properties that are grown rather than designed, and this led to the opening of this blog.

I decided to end this year with a post that fits squarely into this blog’s positioning – i.e., what can we learn from biological systems to design distributed information systems? – with a focus on event-driven architectures. The starting point for this post is the report “Inside the Internet of Things (IoT)” from Deloitte University Press. This is an excellent document, which I found interesting from a technology perspective, but which I thought could be expanded with a more “organic” vision of IoT systems design. The “Information Value Loop” proposed by Deloitte advocates for augmented intelligence and augmented behavior, which is very much aligned with my previous post on the topic of IoT and Smart Systems. The following schema is extracted from this report; it shows a stack of technology capabilities that may be added to the stream of information collected from connected objects. From a technologist’s standpoint, I like this illustration: it captures a lot of key capabilities without losing clarity. However, it portrays a holistic, unified, structured vision which is, in my opinion, too far removed from the organic nature of the Systems of Systems that will likely emerge in the years to come.



The first section of this post will cover event-driven architectures, which are a natural framework for such systems. They also make perfect instances of the “Distributed Information Systems” to which this blog is dedicated. The next section will introduce Complex Event Processing (CEP) as a platform for smart and adaptive behavior. I will focus mostly on how such systems should be grown and not designed, following in the footsteps of Kevin Kelly. The last section will deal with the “cognitive computing” ambition of tomorrow’s smart systems. I will first propose a view that complements what is shown in the Deloitte document, borrowing from biology to enrich the pattern of a “smart adaptive IoT system”. I will also advocate for a more organic, recursive, fractal vision of System of Systems design, in the spirit of the IRT SystemX.

I use the concept of biomimicry in a loose sense here, which is not as powerful or elegant as the real thing, as explained by Idriss Aberkane. In this blog, biomimicry (which I have also labelled “biomimetism” in the past) means looking to nature as a source of inspiration for complex system design – hence the title of the blog. In today’s post, I will borrow a number of design principles for “smart systems of systems” from what I read in biology about the brain or the human body, but a few of these principles come directly from readings about complex systems.

2. Event-Driven Architectures



Event-Driven Architectures (EDA) are well suited to designing systems around smart objects, such as smart homes. Event-driven architectures are naturally scalable, open and distributed. The “Publication/Subscription” pattern is the cornerstone of application integration and modular system design. It was, incidentally, the foundation of application integration two decades ago, so there is no surprise that EDA has found its way back into SOA 2.0. I will not talk about technology solutions in this post, but a number of technologies that fit EDA systems, such as Kafka or Samza, have appeared in the open source community. There is a natural fit with the Internet of Things (IoT) – the need for scalability, openness, decoupling – which is illustrated, for instance, in Cisco’s paper “Enriching Business Process through Internet of Everything”. Its reference to If This Then That (IFTTT), one of the most popular smart object ecosystems, is a perfect example: IFTTT has built its strategy on an open, API-based, event-driven architecture. The smart home protection service provided by myLively.com is another great instance of event-driven architecture at work to deliver a “smart experience” using sensors and connected devices.
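To make the publish/subscribe idea concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the decoupling at the heart of EDA, not tied to Kafka, Samza or any real broker; the `EventBus` class and the topic names are assumptions made for this example.

```python
# A minimal publish/subscribe sketch: publishers and subscribers only share topic names.
from collections import defaultdict
from typing import Callable, Dict, List

class EventBus:
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        # Subscribers register interest in a topic; publishers never know who listens.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver the event to every handler currently subscribed to the topic.
        for handler in self._subscribers.get(topic, []):
            handler(event)

bus = EventBus()
bus.subscribe("home/door/opened", lambda e: print("notify phone:", e))
bus.subscribe("home/door/opened", lambda e: print("turn hallway light on:", e))
bus.publish("home/door/opened", {"sensor": "front-door", "ts": "2015-12-20T10:00:00Z"})
```

Adding a new behavior (a new subscriber) requires no change to the publishing device, which is exactly the openness that makes EDA attractive for IoT ecosystems.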


In a smart system that adapts continuously to its environment, the preferred architecture is to distribute control and analytics. This is our first insight, drawn both from complex systems and biological systems analysis. There are multiple possible reasons for this – the variety of control & analytics needs, the need for redundancy and reliability, performance constraints, … – but it should be taken more as an observation than as a rational law (and it is more powerful as such). It is clear that “higher-level” control functions are more prone to errors and failure and that they typically react more slowly, which is why nature seems to favor redundant designs with multiple control engines and failover modes. Translated into the smart systems world, it means that we should avoid both single points of failure (SPOF) and single points of decision (SPOD). In a smart home system, it is good to keep manual control of the command layer if the automated system is down, and to keep the automated system running if the “smart” system is not operating properly. By contrast with a centralized decision-making design, the distributed decision architecture described decades ago by Marvin Minsky in his Society of Mind is a better pattern for robust smart systems. From a System of Systems design perspective, distributed control and analytics is indeed a way to ensure better performance (place the decision closer to where the action is, which recalls the trend towards edge computing, as exemplified by Cisco’s fog computing). It is also a way to adapt the choice of technology and analytics paradigms to the multiple situations that occur in a large distributed system.
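The layered fallback described for the smart home can be sketched very simply: each control layer tries the “smarter” controller first and degrades gracefully to the more deterministic one, so there is no single point of decision. All class names below are illustrative assumptions, not an actual product design.

```python
# Hedged sketch of layered control with graceful degradation (no SPOD).

class SmartController:
    def decide(self, command: str) -> str:
        raise RuntimeError("learning service unreachable")  # simulate an outage

class AutomatedController:
    def decide(self, command: str) -> str:
        return f"rule-based handling of '{command}'"

class ManualController:
    def decide(self, command: str) -> str:
        return f"direct execution of '{command}'"

def handle(command: str) -> str:
    # Try layers from the most "intelligent" to the most basic.
    for layer in (SmartController(), AutomatedController(), ManualController()):
        try:
            return layer.decide(command)
        except Exception:
            continue  # this layer is down, fall back to the next one
    return "no controller available"

print(handle("turn on living-room lights"))  # falls back to the rule-based layer
```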



A natural consequence of control distribution is the occurrence of redundant distributed storage. Although this is implicit in the Deloitte document, it is worth underlining. Most complex control and decision systems require efficient access to data, hence distribution and redundancy are a matter of fact. Which leaves us with age-old data flow and synchronization issues (I write “age-old” since both Brewer’s theorem and the complexity of distributed snapshots show that these problems are here to stay). This topic is out of the scope of this post, but I strongly suggest reading the Accenture document “Data Acceleration: Architecture for the Modern Data Supply Chain”. Not only does the document illustrate the “flow dimension” of the data architecture, which is critical to designing adaptive and responsive systems based on EDA, but it explains the concept of data architecture patterns that may be used in various pieces of a system of systems. There is a very good argument, if one was needed, made for data caching, including main-memory systems. There are two pitfalls that must be avoided when dealing with data architecture design: focusing on a static view of data as an asset, and searching for a unifying holistic design (more about this in the next section: hierarchical decomposition and encapsulation still have merit in the 21st century).
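In the spirit of the “data acceleration” argument, here is a tiny read-through cache sketch: keep hot data in memory close to the decision point and only fall back to the backing store when the local copy is stale. The backing store, key names and TTL policy are assumptions for illustration only.

```python
# Minimal read-through cache sketch: local, time-bounded copies of remote data.
import time

class ReadThroughCache:
    def __init__(self, backing_store, ttl_seconds: float = 5.0) -> None:
        self._store = backing_store          # e.g. a remote database or sensor registry
        self._ttl = ttl_seconds
        self._cache = {}                     # key -> (value, expiry time)

    def get(self, key):
        value, expiry = self._cache.get(key, (None, 0.0))
        if time.time() < expiry:
            return value                     # fresh local copy, no remote round trip
        value = self._store[key]             # fall through to the backing store
        self._cache[key] = (value, time.time() + self._ttl)
        return value

sensors = {"living-room/temp": 20.5}
cache = ReadThroughCache(sensors)
print(cache.get("living-room/temp"))
```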

Smart biological systems operate on a multiplicity of time scales, irrespective of their degree of “smartness”. What I mean by this is that smart living systems have developed control capabilities that operate on different time horizons: they differ not because of their deductive/inductive capabilities, but because their decision cycles run at completely different frequencies. A very crude illustration of this idea could distinguish between reaction (short-term, emphasis on guaranteed latency), decision (still fast but less deterministic) and adaptation (longer term). We shall see in the next section that the same distinction applies to learning, where adaptation can be distinguished from learning and reflection. Using the vocabulary of optimization theory, adaptation learns about the problem through variable adjustment, learning produces new problem formulations and reflection improves the satisfaction function. It is important to understand that either really complex or really simple approaches may be applied at each time scale: short-term is not a degraded version of long-term decision, nor is long-term an amplified and improved version of short-term. This is the reason for the now common pattern of the lambda architecture, which combines both hot and cold analytics. This understanding of multiple time scales is critical to smart System of Systems design. It has deep consequences, since most of what will be described later (goals, satisfaction criteria, learning feedback loops, emotion/pleasure/desire cycles) needs to be thought about at different time scales. In a smart home, we equally need fast, secure and deterministic response times for user commands, smart behavior that requires complex decision capability, and longer-term adaptive learning capabilities such as those of the ADHOCO system which I have quoted previously in this blog.
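As a toy illustration of the lambda-architecture idea mentioned above, the sketch below keeps an approximate real-time view on a “hot” path, periodically recomputes the exact view on a “cold” path, and merges the two at query time. The class and its methods are illustrative assumptions, not a reference to any particular framework.

```python
# Toy lambda-architecture sketch: hot (speed) layer + cold (batch) layer.

class LambdaView:
    def __init__(self) -> None:
        self.batch_view = {}      # exact counts, recomputed from the full event log
        self.speed_view = {}      # incremental counts since the last batch run

    def on_event(self, key: str) -> None:
        # Hot path: cheap, low-latency update.
        self.speed_view[key] = self.speed_view.get(key, 0) + 1

    def run_batch(self, event_log: list) -> None:
        # Cold path: slower, but recomputed from scratch so errors do not accumulate.
        self.batch_view = {}
        for key in event_log:
            self.batch_view[key] = self.batch_view.get(key, 0) + 1
        self.speed_view = {}      # the batch view now covers everything seen so far

    def query(self, key: str) -> int:
        # Serving layer: merge the exact (old) and approximate (recent) views.
        return self.batch_view.get(key, 0) + self.speed_view.get(key, 0)
```

The hot path is not a degraded copy of the cold path: each runs on its own frequency and with its own guarantees, which is precisely the multiple-time-scale point made above.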
In this post I consider a single system of its kind (even if it is a system of systems), but this should be further developed if the system is part of a population, which leads to collective learning (think of Tesla cars learning from one another) and population evolution (cf. Michio Kaku’s vision of emotion as a Darwinian expression of population learning).

3. Emergent EDA Systems



Most systems produced by nature are hierarchical; this also applies to event architectures, which must distinguish between different levels of events. Failure to do so results in systems that are too expensive (for instance, too much is stored) and too difficult to operate. For the architects reading this, please note that an event “system hierarchy” is not an “event taxonomy” (which is supported out of the box by most frameworks): it is an abstraction hierarchy, not a specialization hierarchy (both are useful). A living organism uses a full hierarchy of events: some are very local, some are propagated, some are escalated to another scale of the system, etc. To distinguish between different levels of events, we need to introduce into smart systems what is known as Complex Event Processing (CEP). CEP is able to analyze and aggregate events to produce simple decisions which may trigger other events. You may find a more complete description of CEP in the following pages taken from theCEPblog, from which I have borrowed the illustration on the right. Similarly, you can learn a lot by watching YouTube videos of related open source technology platforms such as Storm.

A key feature of CEP is the ability to analyze and correlate events from a lower level to produce a higher-level event. It is the foundation for event control logic in a “system of systems” architecture, moving from one level of abstraction to another. This is not, however, the sole responsibility of the CEP system. True to our “analytics everywhere” philosophy, “smarter” analytics systems, such as Big Data machine learning systems, need to be integrated into the EDA to participate in the smart behavior of the global system, in a fashion that is very similar to the organization of a living being.
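A small sketch of this correlation idea: several low-level events within a sliding time window are aggregated into a single higher-level event, which is re-published so that other subscribers can react at the next level of abstraction. The thresholds, topic names and the `bus` object (anything exposing a `publish(topic, event)` method, such as the earlier pub/sub sketch) are assumptions for illustration only.

```python
# Hedged CEP-style sketch: correlate low-level events into a higher-level one.
from collections import deque

class IntrusionDetector:
    def __init__(self, bus, window_seconds: float = 60.0, threshold: int = 3) -> None:
        self._bus = bus
        self._window = window_seconds
        self._threshold = threshold
        self._timestamps = deque()

    def on_failed_badge_reading(self, event: dict) -> None:
        now = event["ts"]                     # numeric timestamp (e.g. epoch seconds)
        self._timestamps.append(now)
        # Drop events that fall outside the sliding window.
        while self._timestamps and now - self._timestamps[0] > self._window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self._threshold:
            # Emit a higher-level event; subscribers at the next abstraction level decide what to do.
            self._bus.publish("home/security/possible-intrusion",
                              {"evidence": len(self._timestamps), "ts": now})
            self._timestamps.clear()
```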

Kevin Kelly’s advice for growing, rather than designing, emergent systems becomes especially relevant as soon as there is a human in the loop. A key insight from smart system design is to let the system learn about the user and not the opposite (although one may argue that both are necessary in most cases, cf. the fourth section of my previous post). Systems that learn from their users’ behaviors are hard to design: it is easier to start from user feedback and satisfaction and let adaptation run its course than to get the “satisfaction equation” right from the start. This is a key area of expertise of the IRT SystemX, whose scientific and technology council I have the pleasure of leading. A number of the ideas expressed here may be found in my inaugural talk of 2013. Emergence derives from feedback loops, which may be construed as “conversations”. CEP is the proper technology to develop a conversation with the user in the loop, following the insight of Chris Taylor, who is obviously referring to the Cluetrain Manifesto’s “Markets are conversations”. The “complex” element of CEP is what makes the difference between a conversation (with the proper level of abstraction, listening and silence) and an automated response.

Another lesson from complex systems is that common goals should be reified (made first-class objects of the system) and distributed across smart adaptive distributed systems. There are two aspects to this rule. First, it states that complex systems with distributed control are defined by their “finality”, which must be uniquely defined and shared. Second, the finality is transformed into actions locally, according to the local context and environment. This is both a key principle for designing Systems of Systems and a rule which has found its way into modern management theory. It is a lesson that distributed systems practitioners have discovered over and over. I found a vivid demonstration when working with OAI (optimization of application integration) over a decade ago: the best way to respect centrally defined SLAs (service level agreements) is through policies that are distributed over the whole system and interpreted locally, as opposed to implementing a centralized monitoring system. This may be found in my paper about self-adaptive middleware. In the inaugural IRT speech that I mentioned earlier, I talked about SlapOS, the cloud programming OS, because Jean-Paul Smets told me a very similar story about SlapOS’s mechanism for maintaining SLAs, which is also based on the distribution of goals, not commands. Commands are issued locally, in the proper context and environment, which is perfectly aligned with the control distribution strategy described in the previous section.
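Here is a small sketch of “distribute goals, not commands”: the shared goal (a latency SLA) is reified as an object that every node carries, and each node interprets it locally against its own measurements. The class and field names are illustrative assumptions, not a description of the OAI or SlapOS mechanisms themselves.

```python
# Sketch of a reified, distributed goal interpreted locally by each node.
from dataclasses import dataclass

@dataclass(frozen=True)
class LatencySLA:
    target_ms: float          # the centrally defined, shared finality

class Node:
    def __init__(self, name: str, sla: LatencySLA) -> None:
        self.name = name
        self.sla = sla        # every node carries the same goal ...

    def adjust(self, observed_ms: float) -> str:
        # ... but decides locally how to honour it in its own context.
        if observed_ms > self.sla.target_ms:
            return f"{self.name}: shed low-priority work (observed {observed_ms} ms)"
        return f"{self.name}: nominal (observed {observed_ms} ms)"

sla = LatencySLA(target_ms=200.0)
print(Node("edge-gateway", sla).adjust(350.0))
print(Node("core-service", sla).adjust(120.0))
```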

We should build intelligent capabilities the way nature builds muscles: by growing the areas that are getting used. In the world of digital innovation, learning happens by doing. This simple but powerful idea is a roadmap for growing emergent systems: start simple, observe what is missing, but mostly reinforce what gets used. Examples of reinforcement abound in biology, from ant stigmergy to muscle growth through adaptation to effort. Learning by doing is the heart of the lean startup approach, but it also applies to complex system design and engineering. This biology metaphor is well suited to avoiding the pitfall of top-down feature-based design. Smart (hence emergent, if we follow Kevin Kelly’s axiom) systems must be grown in a bottom-up manner, by gradually reinforcing what matters most. This is especially useful when designing truly complex systems with cognitive capabilities (the topic of the next section). Nature tells us to think recursively (think fractal) and to grow from reinforcement (strengthening what is useful). If we throw a little bit of agile thinking into the picture, we understand why it is better to start small when building an adaptive event-driven system.
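A toy illustration of this “grow what gets used” principle: capability weights are reinforced each time they are exercised and slowly decay otherwise, so the system’s investment follows actual usage. The class, rates and method names are arbitrary assumptions made purely to make the metaphor concrete.

```python
# Toy sketch of usage-driven reinforcement: what is used grows, the rest atrophies.

class UsageDrivenGrowth:
    def __init__(self, decay: float = 0.99, reinforcement: float = 1.0) -> None:
        self.weights = {}                 # capability name -> "muscle mass"
        self.decay = decay
        self.reinforcement = reinforcement

    def record_use(self, capability: str) -> None:
        # Each use strengthens the capability.
        self.weights[capability] = self.weights.get(capability, 0.0) + self.reinforcement

    def end_of_period(self) -> None:
        # Everything atrophies a little; only what keeps being used keeps growing.
        self.weights = {k: v * self.decay for k, v in self.weights.items()}

    def priorities(self):
        # Where to invest next: the most-used capabilities first.
        return sorted(self.weights.items(), key=lambda kv: kv[1], reverse=True)
```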


4. Cognitive EDA Systems


As is rightly pointed out by John E. Kelly of IBM, we are entering the new era of cognitive computing, with systems that grow by machine learning, not by programmatic design. This is precisely the vision Kevin Kelly laid out two decades ago. Cognitive systems, says John Kelly, “reason from a purpose”, which means that emergent systems emerge from their finality. The more the “how” is grown from experience (for instance, from data analysis in a Big Data setting), the more important the definition and reification of goals become (cf. the previous section). One could argue that this is already embedded in the Deloitte picture that I showed in the introduction, but there is a deeper transformation at work, which is why machine learning will play a bigger and more central role in IoT EDA architectures. I strongly suggest that you watch Dario Gil’s video about the rise of cognitive computing for IoT. His argument about the usefulness of complex inferred computer models without causal validation is very similar to what is said in the recently issued NATF report on Big Data.

Biology obviously has a lot to teach us about cognitive, smart and adaptive systems. A simplistic view of our brain and nervous system distinguishes between different zones:
  • Reflexes (medulla oblongata and cerebellum) – these parts of the brain handle unconscious regulation (medulla oblongata) and fine motor skills (cerebellum).
  • Emotions (amygdala) play a critical role in our decision process. There is an interplay between rational and emotional thoughts that has been popularized by Antonio Damasio’s best-seller. In a previous post, I referred to Michio Kaku’s analysis which makes emotion the equivalent of stored evaluation functions, honed through the evolution process.
  • Inductive thinking (cortex), since the brain is foremost a large associative memory.
  • Deductive thinking (frontal cortex), with a part of the brain that appeared later in the evolution of the species and which is the last to mature in our own development process.

You may look at the previous link or at this one for more detailed information. I take this simplified view as input for the following pattern for cognitive event-driven architecture (see below). This is my own version of the introduction schema, with a few differences. Event-driven architecture is the common glue and Complex Event Processing is the common routing technology. CEP is used to implement reflexes for the smart adaptive system that is connected with its environment (bottom part of the schema). Reflex decisions are based on rules wired with CEP, but also on “emotions”, that is, valuation heuristics applied to input signals. Actions are either the result of reflexes or the result of planning. Goals are reified, as explained in the previous section. This architecture pattern distinguishes between many different kinds of analytics and control capabilities. It would be even richer if the multiple time-scale aspect were clearly shown. As said earlier, a number of these components (goals, emotions, anticipation) should be further specialized according to the time horizon on which they operate. Roughly speaking, one may recognize the earlier distinction between reflexes (CEP), decisions (with a separation between decision and planning, because planning is a specialized skill whereas decision could be left to a wide range of Artificial Intelligence technologies) and learning. Learning – which is meant to be covered by Big Data and Machine Learning capabilities – produces both adaptation (optimizing existing concepts) and “deep learning” (deriving new concepts). Learning is also leveraged to produce anticipation (forecasting), which is a key capability of advanced living beings. A specialized form of long-term learning, called reflection, is applied to question emotions against long-term goals (reflection is a long-term process that assesses the validity of the heuristic cost functions used to make short-term decisions with respect to longer-term goals). Although this schema is a very simplified form of a learning system, it already shows multiple levels of feedback learning loops (meant to be used on different time scales).
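To complement the schema, here is a deliberately simplified sketch of the pattern: a reflex layer (CEP-style rules plus “emotion” heuristics scoring incoming events), a deliberative layer that plans against reified goals, and a learning hook that adjusts the heuristics over time. Every name and threshold is an assumption made for illustration only; it is a sketch of the pattern, not an implementation of it.

```python
# Simplified cognitive EDA sketch: reflexes, emotions, reified goals, planning, learning.

class CognitiveAgent:
    def __init__(self) -> None:
        self.goals = {"keep_home_safe": 1.0, "save_energy": 0.5}   # reified goals
        self.emotions = {"smoke": 0.9, "motion": 0.3}              # valuation heuristics

    def valuate(self, event: dict) -> float:
        # "Emotion": a cheap stored evaluation applied to the input signal.
        return self.emotions.get(event["kind"], 0.1)

    def reflex(self, event: dict):
        # Fast, deterministic path (the CEP part of the pattern).
        if self.valuate(event) > 0.8:
            return "trigger alarm immediately"
        return None

    def plan(self, event: dict) -> str:
        # Slower deliberation: choose the action that serves the strongest goal.
        goal = max(self.goals, key=self.goals.get)
        return f"schedule action for goal '{goal}' given {event['kind']}"

    def learn(self, event: dict, outcome_was_useful: bool) -> None:
        # Long-term loop ("reflection"): nudge the heuristic toward outcomes
        # that actually served the goals.
        delta = 0.05 if outcome_was_useful else -0.05
        kind = event["kind"]
        self.emotions[kind] = min(1.0, max(0.0, self.emotions.get(kind, 0.1) + delta))

    def handle(self, event: dict) -> str:
        return self.reflex(event) or self.plan(event)

agent = CognitiveAgent()
print(agent.handle({"kind": "smoke"}))    # reflex path
print(agent.handle({"kind": "motion"}))   # deliberative path
```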



It is important to notice that the previous picture is an incomplete representation of what has been said in this post. The picture represents a pattern, which is meant to be instantiated in a “multi-scale” / “fractal” design, as opposed to a holistic system design view. The fractal architecture pattern was a core concept of the enterprise architecture book that I wrote in 2004. An organic design for enterprise architecture creates buffers, “isolation gateways” and redundancy that make the overall system more robust than a fully integrated design.

It is easier to build really smart small objects than large systems, so they will appear first and “intelligence” will come locally before coming globally. This is the Darwinian consequence of the organic design principle. When one tries to develop a complex system in the spirit of the previous pattern, it is easier to do so with a more limited scope (input events, intended behaviors, …). Why, you may ask? Because intelligence comes from feedback loop analysis, and it is easier to design and operate such a loop in a closed system with a single designer than in a larger-scope open system. Nothing in the previous schema says that it describes a big system. It could apply to a smart sensor or an intelligent camera. As a matter of fact, smart cameras such as Canary or Netatmo Welcome are good examples of the integration of advanced cognitive functions. A consequence is that the “System of Systems” organic approach is more likely to leverage advanced cognitive capabilities than more traditional integrated or functionally specialized designs (which one might infer from the introductory Deloitte picture). Fog computing makes a good case for edge computing, but it also promotes a functional architecture which I believe to be too homogeneous and too global.


 