The “return of Artificial Intelligence” is an impressive trend of the blogosphere. I spent quite some time and pleasure reading the two great posts from Tim Urban’s blog WaitButWhy entitled “The AI Revolution: The Road to Superintelligence” and “The AI Revolution : Immortality or Extinction” (part 1 and part 2). The core of these posts is the difference between ANI (Artificial Narrow Intelligence, what used to be called weak AI), AGI (Artificial General Intelligence, or strong AI) and ASI (Artificial Superintelligence). These posts are strongly influenced by Ray Kurzweil and his numerous books, but do a great job of collecting and sorting out conflicting opinions. They show a consensus in favor of the emergence of AGI between 2040 and 2060. I strongly recommend reading these posts because they are entertaining, actually quite deep and offer a very good introduction to the concepts that I will develop later on. On the other hand, they miss the importance of perception, emotions and consciousness, which I will address in this post.
The diversity of opinions is striking. On
the one hand, we have very enthusiastic advocates, such as Ray Kurzweil and his
book “How to
Create a Mind” which I will refer to later on, or Kevin Kelly with this
great Wired article: “The
Three Breakthroughs that have Finally Unleashed AI on the world”. For this
camp, strong AI is feasible, it’s coming soon and it’s good. Obviously Larry
Page is in that camp. On the other side, we find either people who simply
don’t believe in the feasibility of strong AI, such as Gerard Berry, famous for
saying that “computers
are stupid”, or people who are very worried about what it could mean for
mankind such as Stephen
Hawking, Bill
Gates or Elon
Musk, to name a few. One of the reasons this topic is so hot on the Web is
that the investment race has started. Each major software company is investing
massively in AI, as Kevin Kelly explains in his article. IBM and Watson
started the race, while Google has been acquiring many companies in the
fields of AI and robotics. Facebook has a massive AI program that has
attracted a lot of attention. Kevin Kelly cites Yahoo, Twitter, LinkedIn and
Pinterest as having invested in AI capabilities recently. There is no debate
about the tidal wave of ANI (the weak form of AI), which is depicted by both Kevin
Kelly and Tim Urban. It’s already here, it’s working quite well and it’s
improving rapidly. The big race (and those who invest believe that there is a
game changer ahead) is to get first to the next generation of Artificial
Intelligence.
I decided to write my own opinion a couple
of months ago, first because I got tired of hearing the same old arguments about
why strong AI was impossible, and also because I found while reading Tim Urban
or Kevin Kelly (to name a few) that some of the key ingredients to make it
happen were missing. For instance, there is too much emphasis on computing
power, which is a key factor but is not enough, in my opinion, to produce AGI,
even though I have read and appreciated Ray Kurzweil’s books. I must say that I
have been away from computer science for too long to qualify as an expert in any
form. Let us say that I am an educated amateur, because I started my career and my
PhD in the fields of knowledge representation, rule-based
systems and so-called expert systems. I worked a long time ago on machine
learning applied to algorithm generation and then more recently on
intelligent agent learning with the GTES
framework. There is some irony for me in writing these pages, since one
of my first lectures when I was a student at the Ecole Normale Supérieure in
1984 was on the topic of AI replacing today’s workforce (to be compared with one
of my posts on the same topic last year).
In this post I will explore four ideas
which seem to be, in my opinion, missing from what I have read during the past
few months:
- Speculating about AI algorithms today as a way to achieve strong AI is hazardous since these algorithms will be synthesized.
- True intelligence requires senses: it requires perceiving and experiencing the world. This is one of the key lessons of the last decades from biology in general and neuroscience in particular, and I do not see why computer AI would escape this fate.
- A similar case may be made about the need for computer emotions. Contrary to what I have heard, artificial emotions are no more complex to embed than computer reasoning.
- Self-consciousness may be hard to code, but it will likely emerge as a property of the next generation of complex systems. We are not talking about giving a “soul to a computer” but about letting free will and consciousness of oneself in relation to time and environment become a key perceived feature of tomorrow’s smart autonomous systems, in the sense of the Turing test.
1. Artificial Intelligence is Grown, Not Designed
This is not a new idea. I have made Kevin
Kelly’s book “Out
of Control” a major reference for this
blog. The central idea of his book is that in order to create really intelligent
systems, you must relinquish control. This is true for weak and strong AI
alike. What makes this idea more relevant today is the combined availability of
massive computing power, massive storage and massive amounts of data. As I
explained in my
Big Data post where I quoted Thomas Hofmann, “Big Data is becoming at the core of computer
science”, the new way of designing algorithms is to grow them from massive
amounts of data. These new algorithms are usually “simple” (parts or whole are
sub-linear) in order to absorb really massive amounts of data (peta-bytes
today, much more tomorrow). One thing that we have learned from the past years
is that simpler algorithms trained on really huge corpora of evidence do better
than more complex algorithms trained on smaller samples. This has been shown in
machine translation, grammar checking and other machine learning domains.
One of the key AI algorithmic technologies
of the moment is Convolutional
Neural Networks (CNNs), and the emphasis is on “deep learning”. CNNs are
a family of neural networks – trying to replicate the brain’s mechanism for
learning from layers of neurons – that are trained by back-propagating
information from the training set through the network of neurons. For instance,
you may read Mark Montgomery’s entry on “recent
trends in artificial intelligence algorithms”. Deep learning has received a
lot of media attention thanks to the success of DeepMind
and its founder Demis Hassabis. Feedforward neural networks are a good
example of systems that are grown, not designed.
If you read carefully about the best
methods for speech recognition and language generation, you will see that you
need more than CPU power and large training sets, you actually need lots of
memory to keep that information “alive”. I borrow from Mark Montgomery the
citation from Sepp Hochreiter because he makes a very important point: “The advent of Big Data together with
advanced and parallel hardware architectures gave these old nets a boost such
that they currently revolutionize speech and vision under the brand Deep
Learning. In particular the “long short-term memory” (LSTM) network, developed
by us 25 years ago, is now one of the most successful speech recognition and
language generation methods”. Kevin Kelly attributes the “long awaited
arrival of AI” to three factors: cheap parallel computation, big data and
better algorithms. I obviously agree with those three, but his vision of “big
data” as the availability of large training sets is too narrow.
I also do not believe that the current
algorithms of 2015 are indicative of what we will grow in 2040 when we have
massively superior computing and storage capabilities. History shows that the
mind follows the tool and that scientists adapt continuously to the new capacities
of their tools. We are still in the infancy stage, because our computing
capabilities are really very limited (more on this in the next section). Among
the skeptics in the computer science community are people who think – and I
have to agree – that being able to play old arcade games at a “genius level” is
still very far from a true step towards AGI. The ability to devise and explore
search and game strategies has been around for a long time in the AI community
(i.e., playing a game without the rules). Many critics of the
possibility of AI cite the difficulty of creating, of producing art or of inventing
new concepts. Here I tend to think the opposite, based on the last decade of seeing
computers used in music or mathematics (as a hint, I would like to quote Henri
Poincaré: “Mathematics is the art of giving the same name to different things”).
Creation is not difficult to express as a program; it is actually surprisingly easy
and effective to write a program that explores a huge abstract space that
represents new ideas, new images or new music. The hard part is obviously to
recognize value in the creation, but computers are getting better at it.
2. A Truly Smart Artificial Intelligence Must Experience the World
The most common argument against strong AI
and “true” natural language processing, when I was still close to the
scientific AI community, was the “semantic problem”, that is, the difficulty of
associating meaning with words in a computer program. What we have learned in the
last decades is that natural language cannot be understood through formal
methods. Grammar, syntactical rules and lexicography cannot help you much without a “semantic
reference”, which is necessary to understand, and even to disambiguate, many of the sentences
that make up our everyday life. Somehow, one needs a phenomenology foundation to
understand humans and to be able to converse convincingly.
The true revolution that is happening
gradually is that the Web may be used as this “phenomenology
foundation”. This was explained to me many years ago by Claude Kirchner
during a talk at the NATF: if you are a computer and need to think “from
experience” about a dog, why not use the network of millions of documents
returned by a Google search with the query “dog” as the phenomenology reference? It
requires massive amounts of computing and storage, but it is more and more
feasible. In all its richness, diversity and links with other experiences, this
cloud of documents (text, image, video, ...) makes a solid foundation to answer common-knowledge questions about
dogs. This is a departure from previous approaches where the huge amount of
sources available on the web is used to produce “abstractions” (concepts that
are represented by bit-vectors produced by techniques such as Latent Semantic
Indexing from my departed friend Thomas Landauer).
The idea here is to keep the whole network of documents in memory as a substitute
for experiencing a dog. I am belaboring this point (one could say that it is lazy
deep learning) because it is a key point when one wants to understand when we
may get strong AI widely available: it is not the same thing to have the whole
set of documents stored in your computer brain or to build a model through
training. This is, to me, a key point since we have learned from other
scientists that it is very hard to separate perception and thinking, as it is
hard to separate body and mind. An obvious reference that comes to mind is Alain Berthoz and his
work on sight (for instance, you may read his
book on decision).
As a first hint that having access to huge
amounts of data builds the capability to understand texts, we have started to
see significant progress in natural language processing (NLP) and we are bound
to see much more when more storage and more processing power become available.
NLP is one of the key priorities for the
Facebook AI program that I mentioned earlier. It is also a key priority for
Google, Apple
and many, many others. There are already a number of exciting signs that we are
making progress. For instance, computers can now play word games, such as
the ones that make up IQ tests, better
than most humans. This is not yet an example of keeping all “experience
knowledge in memory”, but a sign that deep learning applied to massive amounts
of data can work pretty well. Another sign that the race towards NLP is raging is
the appearance of services that are mostly based on answering questions. The
obvious reference here is IBM Watson, but there are many other innovative
services that are popping up, such
as texting services on top of WeChat. Many of these texting/concierge
services are using a hybrid of human/robot assistance, waiting for technology
to become fully sufficient. I also hear a lot of frustration in my close circle
about the shortcomings of Google translate or Apple Siri, but the progress rate
is very impressive. If you are not convinced, read this fascinating article
about IBM Watson’s training. During a lecture which I attended last month, Andrew
McAfee used the graph (Figure 9) where you see the level of coverage/precision
reached by Watson version after version, as a great illustration of the power
of exponential technology growth.
This being said, one of the reasons I am emphasizing the need for memory
is that the slowdown of DRAM capacity growth may come sooner than the suspected
decline of Moore’s Law. It turns out that there are many ways to continue increasing
processing power, even if clock speed is close to its limit and if integration
(reducing the transistor dimension) is also, in its two-dimensional version, not
so far from hitting hard limits. On the other hand, DRAM performance seems to
progress more slowly and with fewer routes to continue its growth. You may take a
look at the table or the following
chart to see that computer memory is progressing more slowly than processors,
which are progressing more slowly than disks (this last part is very well explained in
“The Innovator’s Dilemma”). Another way to look at it is as follows. I have been
waiting for 1 PB (petabyte) of memory on my PC for many years … in the early
90s, I had a couple of megabytes, today I have a couple of gigabytes. Even at the
previous CAGR of 35%, it may take 50 years to get there, which is why I am more
with the group of thinkers who predict the arrival of AGI around 2060 than with the optimistic
group (2040). Admittedly, you could say that asking for one PB is asking a lot (there
are many ways to
get this number, mine was simply 100K experiences times 10 GB of
real-life data), but clearly assuming that memory will continue to grow at the
same rate is too optimistic.
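The back-of-the-envelope estimate above can be checked with a few lines of Python. This is only a sketch under stated assumptions: “a couple of gigabytes” is taken to be 2 GB, units are decimal, the 25% rate is a made-up example of slower growth, and the function name `years_to_reach` is invented for the illustration.

```python
import math

def years_to_reach(start_bytes: float, target_bytes: float, cagr: float) -> float:
    """Years needed to grow from start to target at a constant annual growth rate."""
    return math.log(target_bytes / start_bytes) / math.log(1.0 + cagr)

GB = 10**9
PB = 10**15

# Target used in the text: 100K experiences times 10 GB each, i.e. 1 PB.
target = 100_000 * 10 * GB

# Assumption: "a couple of gigabytes" is taken as 2 GB.
start = 2 * GB

years_35 = years_to_reach(start, target, 0.35)  # the CAGR quoted in the text
years_25 = years_to_reach(start, target, 0.25)  # hypothetical slower growth rate

print(f"at 35% per year: {years_35:.0f} years")
print(f"at 25% per year: {years_25:.0f} years")
```

With these inputs the 35% rate gives a little over four decades, and the result is quite sensitive to the rate: a modest slowdown pushes the date well beyond fifty years.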
Linking
a computer to a very large set of “experiences” is one step; the next approach
is to build autonomous robots with their own senses. I often talk about the robotic arm from the
University of Tokyo which is able to catch an egg that is launched towards
it at full speed, and which is also able to play baseball with the accuracy of
a professional player. The reason for this engineering feat is not an
incredible algorithm, it is the incredible speed at which the robot sees the
world, at 50 thousand images per second. At that speed, the ball or the egg
moves very slowly and the control algorithm for the arm has a much easier job
to perform. Because of the importance of senses, experiences and perception, it
may be the case that we see faster progress from autonomous robots than cloud
AI as far as reaching AGI is concerned. One could say that the best way to
train an artificial intelligence is to let it learn by doing, by acting and exploring
with a full feedback loop (which is precisely what happens with
the DeepMind arcade game experiments). This may mean that autonomous
robots, which will clearly be fitted with exceptional perception senses – one may
think of Google’s autonomous car as an example – will be in the best situation to
grow an emergent strong form of artificial intelligence.
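To get a feel for why 50 thousand images per second makes the control problem so much easier, here is a small sketch that computes how far a ball travels between two consecutive images. The ball speed of 40 m/s is my own assumption for a fast pitch, and the 30 and 1,000 images-per-second rows are only comparison points; neither comes from the University of Tokyo work.

```python
def travel_per_frame_mm(speed_m_per_s: float, frames_per_s: float) -> float:
    """Distance an object moves between two consecutive frames, in millimetres."""
    return speed_m_per_s / frames_per_s * 1000.0

# Assumption: a fast pitch at roughly 40 m/s (about 144 km/h).
BALL_SPEED = 40.0

# 30 and 1,000 fps are comparison points; 50,000 fps is the figure from the text.
for fps in (30, 1_000, 50_000):
    print(f"{fps:>6} fps: {travel_per_frame_mm(BALL_SPEED, fps):8.2f} mm per frame")
```

At an ordinary video rate the ball jumps by more than a metre between images, whereas at the rate quoted for the robot it moves by less than a millimetre, so each new image is only a tiny correction to the previous one.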
3. Learning and Decisions Require Emotions
To continue on what we can learn from biology and neuroscience, it seems clear that computers need to balance different types of thinking to reach decisions on a large range of topics, in a way which will appear “intelligent” to us humans. A lot of my thinking for this section has been influenced by Michio Kaku’s book “The Future of the Mind”, but many other references could be quoted here, starting with Damasio’s bestseller “Descartes’ Error”. The key insight from neuroscience is that we need both rational thinking from the cortex and emotional thinking to make decisions. Emotions seem mostly triggered by the low-level “pattern-recognition” circuitry of the brain and the nervous system. This distinction is also related to Kahneman’s system 1 / system 2 description. We seem to be designed to mix inductive and deductive logic.
Michio Kaku has a very
elegant way of looking at the role of emotions in the process of thinking. Emotions
are a “cost / evaluation” function that is hard-wired (through DNA) and has
evolved slowly through evolution, to play two key roles. On the
one hand, emotions are a valuation function that is used as a meta-strategy to
search and to learn when we use the deductive, rational way of thinking. For people
trained in optimization problems, emotions define the first level of the “objective
function”. However, as evolved creatures, we build our own goals, our own
desires and our own cost functions for new situations, that is, how we value
new experiences. The second role of emotions is to be the foundation (one could
say, the anchors) for the cost function that we grow through experience.
This is closely related
to a key cycle in biology which we could call the “learning cycle for living
beings”: pleasure leads to desire, desire to planning, planning to action,
actions lead to experiencing emotions, such as pleasure, fear, pain, etc. I
heard about this cycle a few years ago while attending a complex systems conference.
It seems to describe the learning loop for a large set of living beings, from
very simple to us humans. Emotions, both positive such as pleasure and negative
such as fear, play a key role in this cycle, from evaluating situations to
formulating plans. We can see that a similar design is relevant to the goal of generating strong artificial intelligence. It is clear that a truly smart system must be
able to generate its own goals, which is actually easy, as explained earlier. Simulating
“free will” from randomness is a simple task (very debatable from a philosophy
standpoint but efficient from a pragmatic one). However, intelligence in goal
generation requires the use of an objective function that may evolve as the smart
system is learning. Computer emotions may be used as seeds (anchors) of this
objective function. For Michio Kaku, emotions are case-based heuristics that
have been finely tuned through Darwinian evolution to make us a more adaptive
species. Mixing emotions and reasoning is not really a new concept in AI. It is a way of mixing case-based reasoning,
in a “compiled form” that has been learned by previous generations of
software instances, with logical deductive reasoning that is “interpreted” and
unique to each instance. This is clearly a multi-agent model (system 1 vs.
system 2) that reminds us of “The Society of Mind”
proposed by Marvin Minsky in 1986.
A great illustration of this idea proposed
by Michio Kaku is the sense of humor, which may be described as our ability to
appreciate the difference between what we expect (the outcome of our own world
model simulation) and what happens. This is how magic tricks and jokes work. Because
we value this difference, we are playful creatures: we love to explore, to be
surprised, to play games. Kaku makes a convincing argument that the sense of
humor is a key evolutionary trait that favors
our learning ability as a living species. It is also very natural to think that
smart AIs, with a similar ability to plan ahead and simulate constantly what
they expect to happen, should be given a similar “sense of humor” (e.g.,
affinity for the unexpected) as a search “meta-strategy”. This remark also
brings us back to the need for “emotions” to avoid danger (i.e., how we learn not
to play with fire). Kaku also sees the use of free will, in the sense of
exploiting some form of randomness (with the same debate whether it is “true”
freedom or a trick to use some form of biological pseudo-random generator), as
a meta-strategy that evolved as a Darwinian advantage for species competition. He
takes the hare as an example, which needed to develop random paths to avoid the
fox. But a more general case can be made from game theory where we know that
mixed strategies (that combine some form of choice or “free will”) fare better
in a competition than pure (deterministic) strategies. A similar and more
technical point could be made about the use of randomization in
search algorithms, which has been proven in the past decade to be an
effective meta-strategy.
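The game-theory point can be illustrated with the textbook game of “matching pennies”, used here as a stand-in for the hare and the fox. This is a minimal sketch, not something taken from Kaku’s book: the payoff table and the function names are my own, and the opponent is assumed to know our strategy and to pick its best reply.

```python
# Payoff to our player in "matching pennies": +1 if both choices match, -1 otherwise.
PAYOFF = [[1, -1],
          [-1, 1]]

def expected_payoff(p_heads: float, opponent_choice: int) -> float:
    """Expected payoff when we play heads with probability p_heads
    against an opponent playing a fixed choice (0 = heads, 1 = tails)."""
    return (p_heads * PAYOFF[0][opponent_choice]
            + (1 - p_heads) * PAYOFF[1][opponent_choice])

def worst_case(p_heads: float) -> float:
    """Payoff against an opponent who knows our strategy and picks its best reply."""
    return min(expected_payoff(p_heads, choice) for choice in (0, 1))

# Two deterministic (pure) strategies, then the 50/50 mixed strategy.
for p in (1.0, 0.0, 0.5):
    print(f"P(heads) = {p:.1f} -> guaranteed payoff {worst_case(p):+.1f}")
```

A predictable player can always be exploited and loses every round in the worst case, while the player who flips a fair coin cannot be exploited at all. This is the sense in which a dose of randomness is an advantage in a competition.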
I strongly recommend reading Michio Kaku’s book, which has a much larger scope than what is
discussed here. For instance, the pages about experiments at Berkeley to read
thoughts are very interesting. His insights about the role of emotions are
quite fascinating, and make a nice complement to Kurzweil’s book which I’ll
discuss in the next section. To summarize and conclude this section, designing computer emotions is probably the best way to introduce some
form of control into an emergent reasoning autonomous system. Emotions are both
a bootstrap and a scaffolding mechanism for growing free will. They constitute
our first level of objective function, hard-wired together with the more primitive
sensory signals such as pain. As we learn to derive more complex goals, plans
and ambitions, emotions are a control mechanism to keep the new objective
function within stable bounds. Emotions are somehow a simpler information
processing mechanism than the cortex deductive thinking (which is why they work
faster in our bodies) and they evolve at the species level, much more than the
individual level (we learn to control them, not to change them). This makes computer
emotions a mechanism that is far easier to control than emerging intelligence.
My intuition is that this will become a key area for autonomous smart robots.
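One way to picture emotions as anchors that keep a learned objective function within stable bounds is the toy sketch below. It is an illustration only, not a model taken from Kaku or from any existing system: the names `EMOTION_ANCHORS`, `MAX_DRIFT` and `bounded_value`, and all the numbers, are made up. The hard-wired part is fixed, and learning can only shift a valuation by a limited amount.

```python
# Hypothetical hard-wired valuations ("emotions"), identical for every instance.
EMOTION_ANCHORS = {"pain": -1.0, "pleasure": 1.0, "surprise": 0.3}

# Assumption: learning may shift a valuation by at most this much.
MAX_DRIFT = 0.5

def bounded_value(emotion: str, learned_adjustment: float) -> float:
    """Valuation of an experience: hard-wired anchor plus a clipped learned term."""
    clipped = max(-MAX_DRIFT, min(MAX_DRIFT, learned_adjustment))
    return EMOTION_ANCHORS[emotion] + clipped

# However large the learned adjustment grows, pain never becomes attractive.
for adjustment in (0.2, 2.0, 50.0):
    print(adjustment, bounded_value("pain", adjustment))
```

The design choice is that the system remains free to refine what it values, but the designer only has to reason about the anchors and the allowed drift, which is a much smaller object than the full learned behavior.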
4. Consciousness is an Emerging Property of Complex Thinking Systems
Another classical argument of skeptics about the possibility of strong AI is that computers, contrary to humans, will never be aware of their thinking, and therefore never consciously aware of their actions. I disagree with this statement since I think that consciousness will emerge gradually as we build more complex AI systems with deeper reasoning and perceiving abilities (cf. Section 2: perceiving is as important as reasoning). I am aware (pun intended) that there are many ways to understand this statement and that the precise definition is where the hot debate lies. Here, my own thinking has been influenced by Ray Kurzweil’s book “How to Create a Mind”. Even if I do not subscribe to the complete story (i.e., that everything you need to create a mind is explained in this book), I found this book a great read for two reasons: it contains a lot of insights and substance about the story of NLP and AI, and it proposes a model for conscious reasoning which is both practical and convincing. As you may have guessed, my main concern with the approach proposed by Kurzweil is the weak role played by senses and emotions in his mind design.
What I envision is a progressive path towards consciousness:
- Self versus environment: the robot, or autonomous AI, is able to understand its environment, to see and recognize itself as part of the world (the famous “mirror test”).
- Awareness of thoughts: the robot can tell what it’s doing, why and how – it can explain its processing/reasoning steps.
- Time awareness: the robot can think about its past, its present and its future. It is able to formulate scenarios, to define goals and to learn from what actually happens compared to its prediction.
- Choice consciousness: the robot is aware of its capability to make choices and creates a narrative (about its goals, its aspirations, its emotions and its experiences) that is a foundation for these choices. “Narrative” (story) is a vague term, which I use to encompass deductive/inductive/causal reasoning.
Although I see a progression, this is not a step-by-step hierarchy. It is an embedded set of capabilities that emerge when sensing, modeling and reasoning skills grow. Emergence of consciousness is a key element of Kurzweil’s book, as shown by this quote: “My own view, which is perhaps a subschool of panprotopsychism, is that consciousness is an emergent property of a complex physical system. In this view the dog is also conscious but somewhat less than a human”. The emergent characteristic also implies that it is difficult to characterize, and even more difficult to understand how it comes to be. However, once an AI has reached the four levels of conscious abilities that I just described, it is able to talk to us about self-awareness in a very convincing manner. One could object that this is a narrow, practical definition of consciousness, but I would say that it is the one that matters practically, for strong AI and autonomous robot applications. I will not touch in this post on the key question of knowing whether human consciousness is of the same nature (an emerging property of ourselves as a complex system, an essentially different characteristic of our species, or an attribute of our immortal soul). One of the hot questions about strong AI is the “hard problem of consciousness” defined by David Chalmers. The “easy problems of consciousness” are self-awareness capabilities that Chalmers and many others see as easily accessible to robots. The “hard problem” concerns reflective thoughts about one’s experiences, which seem harder to capture with a computer program. Without trying to answer this hard question, it is clear to me that consciousness requires experience, hence the emphasis I have put on senses, perceptions and emotions. I also believe that, given sufficient complexity, sensing and reasoning capabilities, emergence may grow “artificial consciousnesses” that will come close to the “hard level of consciousness”.
It is also clear that this will open a number of ethical issues about what we can and cannot do when we experiment with this type of strong AI program. For lack of time, I refer you to James Hughes’s book “Citizen Cyborg”, where the rights of emerging conscious beings are discussed.
5. Concluding Thoughts
There is much more that needs to be said, especially on the philosophical level about consciousness and the political level about the societal risks. So I will not risk a “conclusion”; I will instead close with a few thoughts. My previous post on this topic is almost 10 years old, but I have a keen intuition that many will follow sooner than 2025 :)
- First, it is clear now that weak AI, or ANI, is already here in our lives, and has been progressing for the last twenty years, making these lives easier. The two articles from Tim Urban and Kevin Kelly that I mentioned in this post give a detailed account with plenty of evidence. I can also point to James Haight’s post “What’s next for artificial intelligence in the enterprise?”. Kevin Kelly emphasizes the advent of “AI as a service”, delivered from the cloud by a small set of world leaders. I think he has a fair point: there is clearly a first-mover/scale advantage that will favor IBM, Google and a few other large players.
- However, there are more opportunities than “smart thinking in the cloud”: (weak) AI is everywhere and will continue to be ubiquitous. Machine learning is already here in our smartphones, and the next decades of Moore’s Law mean that connected objects and smart devices will be really smart.
- The race towards strong (or at least stronger) AI is on, as illustrated by the massive investments made by large players in that field. The next target is NLP (natural language processing), which is within our reach because of the exponential progress of computing power, big data (storage capacity and availability of data) and deep learning algorithms.
- This is a very disruptive topic. I do not agree with Kelly’s optimistic vision in his paper, nor with Ray Kurzweil. The disruption will start much earlier than the advent of the strong AI stage. For instance, the tidal wave of ANI may cause such havoc as to make AGI impossible for decades. This could be either for ethical reasons (laws slowing down the access to AGI resources because of concerns about what “weak” AI will already be able to do in a decade) or for political reasons (the turmoil created by massive job destruction due to automation).
- Emotion and senses are part of the roadmap towards strong AI (AGI). Today’s focus is on cortex simulation as a model for future AI, but everything, from cognitive science to biology, suggests that it’s the complete nervous system from brain to body that will teach us how to grow efficient autonomous thinking. This is actually easier to state in a negative form: AI designed without emotions, through a narrow focus on growing cognitive and deductive thinking by emergent learning will most probably be less effective than a more balanced “society of minds” and almost certainly very hard to control.
- Consciousness will emerge along the way towards strong AI. It will happen faster than we think, but it will be more progressive (dog-level, child-level, adult-level, god-knows-what-level, …). Strong AI will not grow “in a box”, it will grow from constant and open interactions with a vast environment.