1. Introduction
Ten years ago I simultaneously discovered the concepts of Autonomic Computing and the fascinating book “Out of Control – The New Biology of Machines, Social Systems and the Economic World” by Kevin Kelly. This came at a moment when I was still the CIO of Bouygues Telecom and puzzling over the idea of “organic operations”. I had become keenly aware that high availability and reliability were managed – on paper – using a mechanistic and probabilistic vision of systems engineering, while real-life crises were solved using a more “organic” approach to how systems worked. This is described in more detail in my first book. Autonomic computing gave me a conceptual framework for thinking about organic and self-repairing systems design. I then had the chance to learn about Google operations in 2006, including a long discussion with Urs Hölzle, and found that many of these ideas were already applied. It became clear that complex properties such as high availability, adaptability or smart behavior could be seen as emergent properties that were grown and not designed, and this led to the opening of this blog.
I decided to end this year with a post that fits squarely into this blog’s positioning – i.e., what can we learn from biological systems to design distributed information systems? – with a focus on event-driven architectures. The starting point for this post is the report “Inside the Internet of Things (IoT)” from Deloitte University Press. This is an excellent document, which I found interesting from a technology perspective, but which I thought could be expanded with a more “organic” vision of IoT systems design. The “Information Value Loop” proposed by Deloitte advocates for augmented intelligence and augmented behavior, which is very much aligned with my previous post on the topic of IoT and Smart Systems. The following schema is extracted from this report; it shows a stack of technology capabilities that may be added to the stream of information collected from connected objects. From a technologist’s standpoint, I like this illustration: it captures a lot of key capabilities without losing clarity. However, it portrays a holistic, unified, structured vision which is, in my opinion, too far removed from the organic nature of the Systems of Systems that will likely emerge in the years to come.
The first
section of this post will cover event-driven architectures, which are a natural framework
for such systems. They also make perfect instances of “Distributed Information
Systems” to which this blog is dedicated. The next section will introduce
Complex Event Processing (CEP) as a platform for smart and adaptive behavior. I
will focus mostly on how such systems should be grown and not designed, following in the footsteps of Kevin Kelly. The last section will deal with the “cognitive
computing” ambition of tomorrow’s smart systems. I will first propose a view
that complements what is shown in the document from Deloitte, borrowing from
biology to enrich the pattern of a “smart adaptive IoT system”. I will also
advocate for a more organic, recursive, fractal vision of System of Systems design, in the spirit of the IRT
SystemX.
I use the
concept of biomimicry in a loose sense here, which is not as powerful or
elegant as the real thing, as explained by Idriss Aberkane. In this
blog, biomimicry (which I have also labelled as “biomimetism” in the past) means looking to nature as a source of inspiration for complex system design – hence the title of the blog. In today’s post, I will borrow a
number of design principles for “smart systems of systems” from what I can read
from biology about the brain or the human body, but a few of these principles
directly come from readings about complex
systems.
2. Event-Driven Architectures
Event-Driven Architectures (EDA) are well suited to designing systems around smart objects, such as smart homes. Event-driven architectures are naturally scalable, open and distributed. The “Publish/Subscribe” pattern is the cornerstone of application integration and modular system design. It was incidentally the foundation of application integration two decades ago, so there is no surprise that EDA has found its way back into SOA 2.0.
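To make the decoupling concrete, here is a minimal, purely illustrative publish/subscribe sketch in Python (an in-process toy, not any particular broker such as Kafka): producers emit events on a topic without knowing who consumes them, and subscribers register handlers for the topics they care about.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class EventBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        # Consumers register an interest in a topic, not in a specific producer.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Producers emit events without knowing who consumes them.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
bus.subscribe("home.door.opened", lambda e: print("alarm check:", e))
bus.subscribe("home.door.opened", lambda e: print("light automation:", e))
bus.publish("home.door.opened", {"door": "front", "ts": "2015-12-27T18:00:00Z"})
```

The point of the sketch is the openness: new consumers (an analytics service, a notification service) can be added without touching the producers, which is exactly what makes EDA a good fit for systems of connected objects.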
I will not talk about technology solutions in this post, but a number of technologies such as Kafka or Samza have appeared in the open source community that fit EDA systems. There is a natural fit with the Internet of Things (IoT) – the need for scalability, openness and decoupling – which is illustrated, for instance, in Cisco’s paper “Enriching Business Process through Internet of Everything”. Its reference to If This Then That (IFTTT), one of the most popular smart object ecosystems, is a perfect example: IFTTT has built its strategy on an open, API-based, event-driven architecture. The smart home protection service provided by myLively.com is another great instance of an event-driven architecture at work to deliver a “smart experience” using sensors and connected devices.
In a smart system that adapts continuously to its environment, the preferred architecture is to distribute control and analytics. This is our first insight, drawn both from complex systems and from biological systems analysis. There are multiple possible reasons for this – the variety of control & analytics needs, the need for redundancy and reliability, performance constraints, and so on – but it should be taken more as an observation than as a rational law (and it is more powerful as such). It is clear that “higher-level” control functions are more prone to errors and failure and that they typically react more slowly, which is why nature seems to favor redundant designs with multiple control engines and failover modes. Translated into the smart systems world, it means that we should avoid both single points of failure (SPOF) and single points of decision (SPOD). In a smart home system, it is good to keep manual control through the command layer if the automated system is down, and to keep the automated system running if the “smart” system is not operating properly. By contrast with a centralized design, the distributed decision architecture described decades ago by Marvin Minsky in his Society of Mind is a better pattern for robust smart systems. From a System of Systems design perspective, distributed control and analytics is indeed a way to ensure better performance (place the decision closer to where the action is, which recalls the trend towards edge computing, as exemplified by Cisco’s fog computing). It is also a way to adapt the choice of technology and analytics paradigms to the multiple situations that occur in a large distributed system.
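As a rough illustration of the “no single point of decision” idea in a smart home, here is a hypothetical sketch (the layer names and behaviors are mine, not an actual product design): the manual command layer remains usable when the automated layer is down, and the automated layer keeps working when the “smart” layer fails.

```python
class SmartLayer:
    """Hypothetical 'smart' controller; may be unavailable or fail."""
    available = True
    def decide(self, event: dict) -> str:
        if not self.available:
            raise RuntimeError("smart layer offline")
        return f"smart plan for {event['type']}"

class AutomationLayer:
    """Simple rule-based automation; used when the smart layer fails."""
    available = True
    def decide(self, event: dict) -> str:
        if not self.available:
            raise RuntimeError("automation offline")
        return f"rule-based action for {event['type']}"

class CommandLayer:
    """Direct user commands: the last-resort control path, always on."""
    def decide(self, event: dict) -> str:
        return f"manual handling of {event['type']}"

def handle(event: dict, layers) -> str:
    # Try the highest-level controller first, then fall back gracefully:
    # no single point of decision.
    for layer in layers:
        try:
            return layer.decide(event)
        except RuntimeError:
            continue
    raise RuntimeError("no control layer available")

layers = [SmartLayer(), AutomationLayer(), CommandLayer()]
layers[0].available = False               # simulate a smart-layer outage
print(handle({"type": "door_open"}, layers))  # -> rule-based action
```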
A natural consequence of control distribution is the occurrence of redundant distributed storage. Although this is implicit in the Deloitte document, it is worth underlining. Most complex control and decision systems require efficient access to data, hence distribution and redundancy are facts of life. This leaves us with age-old data flow and synchronization issues (I write “age-old” since both Brewer’s theorem and snapshot complexity show that these problems are here to stay). This topic is out of the scope of this post, but I strongly suggest reading the Accenture document “Data Acceleration: Architecture for the Modern Data Supply Chain”. Not only does the document illustrate the “flow dimension” of the data architecture, which is critical to designing adaptive and responsive systems based on EDA, but it explains the concept of data architecture patterns that may be used in various pieces of a system of systems. There is a very good argument, if one was needed, made for data caching, including main-memory systems. There are two pitfalls that must be avoided when dealing with data architecture design issues: focusing on a static view of data as an asset and searching for a unifying holistic design (more about this in the next section: hierarchical decomposition and encapsulation still have merit in the 21st century).
Smart biological systems operate on a multiplicity of time scales, irrespective of their degree of “smartness”. What I mean by this is that smart living systems have developed control capabilities that operate on different time horizons: these capabilities do not differ by their deductive/inductive power, but because their decision cycles run at completely different frequencies. A very crude illustration of this idea could distinguish between reaction (short-term, emphasis on guaranteed latency), decision (still fast but less deterministic) and adaptation (longer term). We shall see in the next section that the same distinction applies to learning, where adaptation can be distinguished from learning and reflection. Using the vocabulary of optimization theory, adaptation learns about the problem through variable adjustment, learning produces new problem formulations and reflection improves the satisfaction function. It is important to understand that really complex – or simple – approaches may be applied to each time scale: short-term is not a degraded version of long-term decision making, nor is long-term an amplified and improved version of short-term. This is the reason for the now common lambda-architecture pattern, which combines both hot and cold analytics. This understanding of multiple time scales is critical to smart System of Systems design. It has deep consequences, since most of what will be described later (goals, satisfaction criteria, learning feedback loops, emotion/pleasure/desire cycles) needs to be thought about at different time scales. In a smart home, we equally need fast, secure and deterministic response times for user commands, smart behavior that requires complex decision capabilities, and longer-term adaptive learning capabilities such as those of the ADHOCO system which I have cited previously in this blog.
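A minimal sketch of this hot/cold split, in the spirit of the lambda architecture (the names, the example data and the threshold heuristic are illustrative assumptions, not a reference implementation): the hot path reacts to every event against the current model with bounded latency, while the cold path periodically recomputes that model from a larger history.

```python
import statistics
from collections import deque

class HotPath:
    """Reaction time scale: per-event decision against the current model."""
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
    def on_event(self, value: float) -> str:
        return "alert" if value > self.threshold else "ok"

class ColdPath:
    """Adaptation time scale: periodic batch recomputation of the model."""
    def __init__(self, window: int = 1000) -> None:
        self.history = deque(maxlen=window)
    def record(self, value: float) -> None:
        self.history.append(value)
    def recompute_threshold(self) -> float:
        # illustrative rule: mean + 3 standard deviations over the window
        mean = statistics.mean(self.history)
        std = statistics.pstdev(self.history)
        return mean + 3 * std

cold = ColdPath()
hot = HotPath(threshold=30.0)
for temperature in [21.0, 22.5, 20.8, 35.2, 22.1]:
    print(hot.on_event(temperature))        # immediate, deterministic reaction
    cold.record(temperature)                # feeds the slower learning loop
hot.threshold = cold.recompute_threshold()  # adaptation at a slower cadence
```

The short-term path is not a degraded version of the long-term one: each uses the approach suited to its own time scale, and the only coupling is the periodic update of the model.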
In this post I consider a single system (even if it is a system of systems), but this should be further developed if the system is part of a population, which leads to collective learning (think of Tesla cars learning from one another) and population evolution (cf. Michio Kaku’s vision of emotion as a Darwinian expression of population learning).
3. Emergent EDA Systems
Most systems produced by nature are hierarchical; this also applies to event architectures, which must distinguish between different levels of events. Failure to do so results in systems that are too expensive (for instance, too much is stored) and too difficult to operate. For the architects reading this, please note that event “system-hierarchy” is not “event taxonomy” (which is supported out of the box by most frameworks); it is an abstraction hierarchy, not a specialization hierarchy (both are useful). A living organism uses a full hierarchy of events: some are very local, some get propagated, some get escalated to another scale of the system, etc. To distinguish between different levels of events, we need to introduce into smart systems what is known as Complex Event Processing (CEP). CEP is able to analyze and aggregate events to produce simple decisions which may trigger other events. You may find a more complete description of CEP in the following pages taken from The CEP Blog, from which I have borrowed the illustration on the right. Similarly, you can learn a lot by watching YouTube videos of related open source technology platforms such as Storm.
A key feature of CEP is to be able to analyze and correlate events from a lower level to produce a higher-level event. It is the foundation for the event control logic in a “system of systems” architecture, moving from one level of abstraction to another. This is not, however, the sole responsibility of the CEP system. True to our “analytics everywhere” philosophy, “smarter” analytics systems, such as Big Data machine learning systems, need to be integrated into the EDA to participate in the smart behavior of the global system, in a fashion that is very similar to the organization of a living being.
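Here is a small, hypothetical sketch of this abstraction step (the event names, counts and time window are invented for the example): several low-level motion events, correlated within a sliding time window while the home is armed, are aggregated into a single higher-level “intrusion suspected” event that can be escalated to the next level of the system.

```python
from collections import deque
from typing import Optional

class MotionToIntrusionDetector:
    """Illustrative CEP-style rule: N low-level motion events within a
    sliding time window, while the home is armed, produce one
    higher-level 'intrusion.suspected' event."""

    def __init__(self, count: int = 3, window_s: float = 60.0) -> None:
        self.count = count
        self.window_s = window_s
        self.timestamps: deque = deque()

    def on_motion(self, ts: float, armed: bool) -> Optional[dict]:
        self.timestamps.append(ts)
        # keep only the events that fall inside the sliding window
        while self.timestamps and ts - self.timestamps[0] > self.window_s:
            self.timestamps.popleft()
        if armed and len(self.timestamps) >= self.count:
            self.timestamps.clear()
            return {"type": "intrusion.suspected", "ts": ts}
        return None

detector = MotionToIntrusionDetector()
for t in [0.0, 10.0, 20.0]:
    high_level = detector.on_motion(t, armed=True)
    if high_level:
        print(high_level)   # escalated to the next abstraction level
```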
Kevin Kelly’s advice for growing, rather than designing, emergent systems becomes especially relevant as soon as there is a human in the loop. A key insight from smart system design is to let the system learn about the user and not the opposite (although one may argue that both are necessary in most cases, cf. the fourth section of my previous post). Systems that learn from their users’ behaviors are hard to design; it is easier to start from user feedback and satisfaction and let adaptation run its course than to get the “satisfaction equation” right from the start. This is a key area of expertise of the IRT SystemX, whose scientific and technology council I have the pleasure of leading. A number of ideas expressed here may be found in my inaugural talk of 2013. Emergence derives from feedback loops, which may be construed as “conversations”. CEP is the proper technology to develop a conversation with the user in the loop, following the insight of Chris Taylor, who is obviously referring to the Cluetrain Manifesto’s “markets are conversations”. The “complex” element of CEP is what makes the difference between a conversation (with the proper level of abstraction, listening and silence) and an automated response.
Another lesson from complex systems is that common goals should be reified (made first-class objects of the system) and distributed across smart adaptive systems. There are two aspects to this rule. First, it states that complex systems with distributed control are defined by their “finality”, which must be uniquely defined and shared. Second, the finality is transformed into actions locally, according to the local context and environment. This is both a key principle for designing Systems of Systems and a rule which has found its way into modern management theory. It is a lesson that has been discovered by distributed systems practitioners over and over. I found a vivid demonstration when working with OAI (optimization of application integration) over a decade ago: the best way to respect centrally-defined SLAs (service level agreements) is through policies that are distributed over the whole system and interpreted locally, as opposed to implementing a centralized monitoring system. This may be found in my paper about self-adaptive middleware. In the inaugural IRT speech that I mentioned earlier, I talked about SlapOS, the cloud programming OS, because Jean-Paul Smets told me a very similar story about SlapOS’s mechanism for maintaining SLAs, which is also based on the distribution of goals, not commands. Commands are issued locally, in the proper context and environment, which is perfectly aligned with the control distribution strategy described in the previous section.
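A minimal sketch of “distribute goals, not commands” (the SLA target, node names and scaling rule are illustrative assumptions, not how SlapOS or any particular middleware actually works): the goal is reified once, shared with every node, and each node reconciles it locally against what it observes, issuing its own commands.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LatencyGoal:
    """Reified, centrally defined goal: a 95th-percentile latency target."""
    p95_ms: float

class LocalNode:
    """Each node interprets the shared goal in its own context,
    issuing local commands instead of receiving them centrally."""
    def __init__(self, name: str, goal: LatencyGoal) -> None:
        self.name = name
        self.goal = goal
        self.replicas = 1

    def reconcile(self, observed_p95_ms: float) -> str:
        if observed_p95_ms > self.goal.p95_ms:
            self.replicas += 1               # local decision: scale up
            return f"{self.name}: scale up to {self.replicas} replicas"
        return f"{self.name}: within SLA"

goal = LatencyGoal(p95_ms=200.0)             # defined once, shared everywhere
nodes = [LocalNode("edge-paris", goal), LocalNode("edge-lyon", goal)]
print(nodes[0].reconcile(observed_p95_ms=250.0))  # acts locally
print(nodes[1].reconcile(observed_p95_ms=120.0))
```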
We should build intelligent capabilities the way nature builds muscles: by growing the areas that are getting used. In the world of digital innovation, learning happens by doing. This simple but powerful idea is a roadmap for growing emergent systems: start simple, observe what is missing, but mostly reinforce what gets used. Examples of reinforcement abound in biology, from ant stigmergy to muscle growth through adaptation to effort. Learning by doing is the heart of the lean startup approach, but it also applies to complex system design and engineering. This biology metaphor is well suited to avoid the pitfall of top-down feature-based design. Smart (hence emergent, if we follow Kevin Kelly’s axiom) systems must be grown in a bottom-up manner, by gradually reinforcing what matters most. This is especially useful when designing truly complex systems with cognitive capabilities (the topic of the next section). Nature tells us to think recursively (think fractal) and to grow from reinforcement (strengthening what is useful). If we throw a little bit of agile thinking into the picture, we understand why it is better to start small when building an adaptive event-driven system.
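As a toy illustration of “reinforce what gets used” (entirely hypothetical, not a real framework): capabilities are tracked by actual usage, and only those that are exercised often enough become candidates for further investment.

```python
from collections import Counter

class GrownSystem:
    """Start with a minimal set of capabilities; reinforce the ones
    that are actually exercised, like muscle growth through use."""

    def __init__(self, promote_after: int = 10) -> None:
        self.usage = Counter()
        self.reinforced = set()
        self.promote_after = promote_after

    def record_use(self, capability: str) -> None:
        self.usage[capability] += 1
        if (self.usage[capability] >= self.promote_after
                and capability not in self.reinforced):
            self.reinforced.add(capability)   # candidate for investment

    def growth_plan(self):
        # reinforce what is used most; ignore what is never exercised
        return sorted(self.reinforced, key=self.usage.get, reverse=True)

system = GrownSystem(promote_after=3)
for cap in ["lighting", "lighting", "heating", "lighting"]:
    system.record_use(cap)
print(system.growth_plan())   # ['lighting']
```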
4. Cognitive EDA Systems
As is rightly pointed out by John E. Kelly from IBM, we are entering the new era of cognitive computing, with systems that grow by machine learning, not by programmatic design. This is precisely the vision of Kevin Kelly two decades ago. Cognitive systems, John E. Kelly tells us, “reason from a purpose”, which means that emergent systems emerge from their finality. The more the “how” is grown from experience (for instance, from data analysis in a Big Data setting), the more the definition and reification of goals become important (cf. the previous section). One could argue that this is already embedded in the Deloitte picture that I showed in the introduction, but there is a deeper transformation at work, which is why machine learning will play a bigger and more central role in IoT EDA architectures. I strongly suggest that you watch Dario Gil’s video about the rise of cognitive computing for IoT. His argument about the usefulness of complex inferred computer models with no causality validation is very similar to what is said in the recently issued NATF report on Big Data.
Biology obviously has a lot to teach us about
cognitive, smart and adaptive systems. A simplistic view of our brain and nervous system distinguishes
between different zones:
- Reflexes (medulla oblongata and cerebellum) – these parts of the brain handle unconscious regulation and fine motor skills (the latter in the cerebellum).
- Emotions (amygdala) play a critical role in our decision process. There is an interplay between rational and emotional thoughts that has been popularized by Antonio Damasio’s best-seller. In a previous post, I referred to Michio Kaku’s analysis which makes emotion the equivalent of stored evaluation functions, honed through the evolution process.
- Inductive thinking (cortex), since the brain is first and foremost a large associative memory.
- Deductive thinking (frontal cortex), a part of the brain that appeared later in the evolution of the species and is the last to mature in our own development.
You may
look at the previous link or at this one
for more detailed information. I take this simplified view as input for the
following pattern for cognitive event-driven architecture (see below). This is
my own version of the introduction schema, with a few differences. Event-Driven
architecture is the common glue and Complex Event Processing is the common
routing technology. CEP is used to implement reflexes for the smart adaptive
system that is connected with its environment (bottom part of the schema).
Reflex decisions are based on rules wired with CEP but also on “emotions”, that
is, valuation heuristics that are applied to input signals. Actions are either
the result of reflexes or the result of planning. Goals are reified, as was
explained in the previous section. This architecture pattern distinguishes
between many different kinds of analytics and control capabilities. It should
be made even richer if the multiple time-scale aspect were clearly shown. As
said earlier, a number of these components (goals, emotions, anticipation) should
be further specialized according to the time horizon under which they operate.
Roughly speaking, one may recognize the earlier distinction between reflexes (CEP), decisions (with a separation between decision and planning, because
planning is a specialized skill whereas decision making could be left to a wide range of Artificial Intelligence technologies) and learning.
Learning – which is meant to be covered by Big Data and Machine Learning
capabilities – produces both adaptation (optimizing existing concepts) and “deep
learning” (deriving new concepts). Learning is also leveraged to produce anticipation
(forecasting), which is a key capability of advanced living beings. A
specialized form of long-term learning, called reflection, is applied to
question emotions against long-term goals (reflection is a long-term process that assesses the validity of the heuristic cost functions used to make short-term decisions with respect to longer-term goals). Although this schema is a
very simplified form of a learning system, it already shows multiple levels of
feedback learning loops (meant to be used with different time scales).
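To summarize the pattern in code form, here is a deliberately simplified, illustrative skeleton (the component names, weights and rules are assumptions of mine, not the actual contents of the schema): reflexes handle events immediately, “emotions” are cheap valuation heuristics applied to input signals, planning works towards reified goals, and a slower reflection loop re-tunes the heuristics against those goals.

```python
class CognitiveNode:
    """Illustrative skeleton of the cognitive EDA pattern: reflexes,
    emotion-like valuation, planning towards reified goals, and a
    slower reflection loop that tunes the valuation heuristics."""

    def __init__(self, goals: dict) -> None:
        self.goals = goals                                  # reified goals
        self.emotion_weights = {"comfort": 1.0, "security": 2.0}

    def valuation(self, event: dict) -> float:
        # "emotion": a cheap heuristic score applied to input signals
        return self.emotion_weights.get(event.get("concern", ""), 0.0)

    def reflex(self, event: dict):
        # fast, rule-based reaction (the CEP layer)
        if event["type"] == "smoke.detected":
            return "trigger alarm"
        return None

    def plan(self, event: dict) -> str:
        # slower, deliberative decision towards the goals
        if self.valuation(event) >= self.goals["attention_threshold"]:
            return f"plan response to {event['type']}"
        return "ignore"

    def handle(self, event: dict) -> str:
        return self.reflex(event) or self.plan(event)

    def reflect(self, outcome_score: float, concern: str) -> None:
        # long-term loop: adjust the heuristic against long-term goals
        self.emotion_weights[concern] *= 1.1 if outcome_score > 0 else 0.9

node = CognitiveNode(goals={"attention_threshold": 1.5})
print(node.handle({"type": "door.opened", "concern": "security"}))
node.reflect(outcome_score=1.0, concern="security")
```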
It is important to note that the previous picture is an incomplete representation of what has been said in this post. The picture represents a pattern, which is meant to be applied in a “multi-scale” / “fractal” manner, as opposed to a holistic system design view. The fractal architecture pattern was a core concept of the enterprise architecture book which I wrote in 2004. An organic design for enterprise architecture creates buffers, “isolation gateways” and redundancy that make the overall system more robust than a fully integrated design.
It is easier to build really smart small objects than large systems, thus they will appear first and “intelligence” will come locally before coming globally. This is the Darwinian consequence of the organic design principle. When one tries to develop a complex system in the spirit of the previous pattern, it is easier to do so with a more limited scope (input events, intended behaviors, …). Why, you may ask? Because intelligence comes from feedback loop analysis, and it is easier to design and operate such a loop in a closed system with a single designer than in a larger-scope open system. Nothing in the previous schema says that it describes a big system. It could apply to a smart sensor or an intelligent camera. As a matter of fact, smart cameras such as Canary or Netatmo Welcome are good examples of the integration of advanced cognitive functions. A consequence is that the “System of Systems” organic approach is more likely to leverage advanced cognitive capabilities than more traditional integrated or functionally specialized designs (which one might infer from the Deloitte picture in the introduction). Fog computing makes a good case for edge computing, but it also promotes a functional architecture which I believe to be too homogeneous and too global.