Sunday, July 7, 2024

How Software-Driven Companies Leverage AI through Software Excellence

  


1. Introduction


In 2011, Marc Andreessen shared with the world his famous article, “Why Software Is Eating the World”, which has become a foundation for thinking about the ongoing digital transformation. It echoed the “Web squared” transformation that had been identified a decade earlier: as everything becomes digital, we generate a digital image of the world that becomes more and more complete, and that digital “twin” becomes the support for an ever-increasing share of our decisions and actions. This is the consequence of two ongoing transformations: every step of our processes, from cooking a loaf of bread to manufacturing a car, becomes the source of a digital flow because of the ubiquity of sensors, and the process automation that started decades ago is digitized through actuators and controlled motors (i.e., robots are everywhere and most of them are invisible). This “world digital twin” revolution comes with the fact that software can control and optimize everything that we do, whether we are talking about the material economy, the immaterial and service economy, or the knowledge economy.

A Software-Driven Company is any company that leverages its software engineering capabilities to differentiate its products and services. At Michelin, we see ourselves as a software-driven company because what we deliver, whether tires, services, or new materials, gets its performance advantage over our competitors from our excellence in R&D, manufacturing, and enrichment with services, through cutting-edge and unique capabilities in simulation, automation, and knowledge engineering built with software. If you agree with the statement that “software is eating the world”, it becomes clear that each company will have to become a software-driven company to succeed in a competitive world, whichever domain of expertise it is part of. Not every company needs to become a software company – though in the knowledge, immaterial, and service economies this is the case – but every company must become a software-driven company to survive in the 21st century.

This blog post will address the use of Artificial Intelligence, since this is the hot topic of the decade. Leveraging AI is nothing more than saying “better software is eating the world”: AI is the next logical evolution of better management and control through more intelligent software. As NVIDIA CEO Jensen Huang declared in 2017, “Software is eating the world, but AI is going to eat software”. This may be understood in two ways, both of which I will address in this post: AI is fueling the new generation of software that is managing our activities (the “intelligent digital twin”), and AI is finding its place in the software engineering process (i.e., this “better software” is produced with the help of AI). To point out the obvious for all companies competing in this century:

  • You may only apply AI to the parts of your processes that are digitized, so digital transformation defines the landscape of AI value creation. A sharp reader might object that even with a 100% material process, AI may be used on the knowledge engineering dimension of that process, but the bulk of the opportunity lies in the digitized world.
  • Delivering AI value means delivering software products to users or to digital systems (most often, to both).

The following illustration is a summary of a LinkedIn post that emphasizes the role of software in AI value creation. It shows why being a software-driven company is critical: first, AI is, as noted earlier, embedded into software systems. These systems are grown from user and ecosystem interactions (the heart of what excellence means in a digital world: constant learning and fast iterations). This excellence in system engineering, software engineering, and data engineering is a must.

[Illustration: summary of the LinkedIn post on the role of software in AI value creation]
This leads us to the obvious consequence that success in our modern world demands software engineering excellence. Excellence for a software-driven company is built on the competitive performance of delivering flows of software products, constant increments of value packaged as software updates. As I have noted in many previous posts, this is no longer a topic for debate. The best-selling book “Accelerate: Building and Scaling High Performing Technology Organizations” is based on the correlation between software delivery performance and competitive value delivery. The blueprint for building a world-class software delivery organization is now well understood, even if it remains a difficult transformation. This transformation is a mix of processes and techniques (lean, agile, devops), of tools and practices (CICD, IaC), and of the mindset for craftsmanship and continuous learning (hence the lean roots). I can refer the reader to my last book, “The Lean Approach to Digital Transformation: From Customer to Code and From Code to Customer,” or to the newer book from Fabrice Bernhard and Benoît Charles-Lavauzelle, “The Lean Tech Manifesto: Learn the Secrets of Tech Leaders to Grasp the Full Benefits of Agile at Scale”.

In a fast-changing world, software engineering excellence requires applying recursively the principle of “AI is eating the world” to software development itself, and leveraging “augmented software engineering”. Consequently, this blog post is organized as follows. Section 2 talks about software craftsmanship. This is a topic that I have covered in a previous blog post, “Software Craftsmanship through Beautiful Code”, but I want to emphasize the business value of code elegance through some new results and add a few principles and observations to the corpus of “what beautiful code might be”. Section 3 returns to the topic of software-driven companies. A software-driven company is defined foremost by its culture and mindset; the technical abilities come second. Everyone needs to show interest in software, from technologies to ecosystems, at a broad and general level, to understand the opportunities that keep opening as software is eating the world. Section 4 then addresses the question of “augmented software engineering”, that is, how to leverage AI in general, and generative AI in particular, to develop software products faster and better. As the field is evolving so rapidly, and because genAI code generation is still in its infancy, this will be a short, incomplete essay, offered as food for thought more than a definitive analysis.

 

2. Software Craftsmanship


To add to what I have previously written, I will start with a few ideas from the article “The Business Impact of Code Quality” by Adam Tornhill and Markus Borg. If you have the time, I strongly recommend watching the associated talk by Adam Tornhill. To quote from the paper: “Code quality remains an abstract concept that fails to get traction at the business level … The resulting technical debt is estimated to waste up to 42% of developers’ time”. The authors analyzed 39 proprietary production code bases, from source code to issue information from Jira. They used their own tool based on their Code Health metric, which incorporates elements from the ISO 5055 maintainability standard and the detection of several code smells. The paper is worth reading in full, but the main findings are that (1) higher quality code leads to fewer issues and lower code maintenance costs, (2) higher quality code improves the predictability of development time, and (3) higher quality code supports the reduction of lead time, in a very significant manner, since low-quality code yields 124% longer time-in-development compared to healthy code. In their conclusion, the authors state that “Our results indicate that improving code quality could free existing capacity; with 15 times fewer bugs, twice the development speed, and substantially more predictable issue resolution time, the business advantage of code quality should be unmistakably clear”.


The importance of code quality and software craftsmanship to business excellence in the context of digital transformation is a central theme of my last book, “The Lean Approach to Digital Transformation”. Adam Tornhill’s findings help put metrics and facts behind the idea that “code elegance” (ease of changing, sharing, and maintaining) has a true business value. To summarize what I expressed in a previous blog post: in the digital age, software systems are characterized by continuous change and adaptation, necessitating a love for code and a departure from the traditional "black box" approach. This shift highlights the importance of readable and maintainable code, as "ugly code" leads to hidden complexities and increased costs. Agile methodologies, which emphasize iteration, must be paired with refactoring to manage the inevitable accumulation of obsolete elements. Elegance in code fosters simplicity and adaptability, akin to Zen gardening. As AI increasingly collaborates with human intelligence, data becomes the new code, further emphasizing the need for elegant, maintainable meta-programmed processes. Collaboration is crucial in this environment, with practices like code reviews and pair programming enhancing software quality. Open-source ecosystems thrive on the elegance of shared code, which facilitates communication and collaboration. The pursuit of simplicity, inspired by principles like Occam's razor, is essential for creating antifragile, adaptable systems. Simplifying code reduces technical debt and inertia, aligning with lean principles and ensuring faster response times. The ongoing challenge is managing the increasing complexity of software while maintaining elegance and simplicity.

As mentioned by Adam Tornhill, technical debt is the enemy of business agility and digital service excellence. Technical debt, first articulated by Ward Cunningham, refers to the long-term costs and inefficiencies resulting from quick-and-dirty shortcuts or suboptimal software architecture. This debt can be managed either by paying "interest"—incurring higher maintenance costs—or by "paying off" the debt through refactoring. Technical debt becomes evident when the need for change arises, as it hinders productivity and increases the time and effort required to bring software up to current standards. The concept includes both the tangible costs, such as maintenance and support, and intangible costs like increased complexity and risk of failure. Complexity, especially in legacy systems, adds to technical debt, making integration of new features costly and challenging. Agile methodologies, while promoting iteration and adaptability, also necessitate regular refactoring to manage accumulating debt. Effective management of technical debt involves maintaining simplicity, modularity, and a clear architecture, coupled with practices like code reviews, testing, and reducing dependencies. These strategies ensure that software systems remain agile and sustainable, minimizing the detrimental impact of technical debt on long-term performance and adaptability.

Besides technical debt, what I describe as “elegance” means the capacity for a piece of code to be easily understood and shared with other developers. It has been said that “programming is what you do on your own, software engineering is what you do in space and time”, that is, collaborating with other programmers, including your own team in the future. Ease of reading and sharing is often referred to as “beautiful code”; ease of maintenance goes deeper into the software architecture (as we shall see next). From a great curated list of key programming essays by Ben Kuhn, I would emphasize the three following contributions:


  1. As software engineers, we do not think enough about the end of life of our products, modules, components, or lines of code. Hence the relevance of this essay: “write code that is easy to delete, not easy to extend”.
  2. Good code is meant to support great products, hence the importance of being product-minded, the necessary change of perspective for migrating from project to product. Gergely Orosz’s post on “The Product-Minded Software Engineer” explains that product-minded engineers proactively contribute ideas and challenge specifications, driven by a deep interest in the business, user behavior, and data. Their curiosity leads them to understand the "why" behind product decisions, often seeking information independently while maintaining strong relationships with non-engineers. They are effective communicators who provide valuable product and engineering tradeoffs, balancing both perspectives to optimize outcomes.
  3. Maintainable code is grown from experience, notably from other developer’s experience. A great example is this set of pieces of advice about distributed systems from Jeff Hodges. Reading, understanding and applying what is described in this paper is a practical way to improve maintainability (in addition to robustness and performance).

I wrote a previous post about “software craftsmanship through beautiful code”, in which I proposed a quick review of a couple of books, including these two:

  • "Clean Code: A Handbook of Software Craftsmanship" by Robert "Uncle Bob" Martin, published in 2008, remains a seminal work on software craftsmanship despite its age. The book emphasizes the importance of writing clean, readable, and maintainable code through practical advice and principles. Martin argues that mastering clean code requires rigorous practice and learning from both personal and others' mistakes. The book covers various aspects of clean coding, such as proper naming conventions, short and purposeful functions, effective commenting, and thoughtful formatting. It also highlights the significance of modular engineering, continuous testing, and refactoring to maintain code quality. Martin aligns software craftsmanship with the lean 5S philosophy—organization, tidiness, cleaning, standardization, and discipline—underscoring the need for collaborative platforms and an ecosystem approach in modern software development. Through principles like SOLID, the book provides a framework for creating code that is not only functional but also elegant and efficient, advocating for a disciplined approach to achieve software excellence.
  • "Beautiful Code – Leading Programmers Explain How They Think," edited by Andy Oram and Greg Wilson, is a collection of essays by top programmers who discuss their favorite software pieces to illustrate what "beautiful code" means to them. The book emphasizes the importance of writing code that is useful, generic, elegant, and efficient, aligning with principles of readability, maintainability, and minimalism. The essays highlight diverse programming domains, such as bioinformatics, NASA’s Mars Rover Mission, and the CERN library, showcasing how well-crafted code fosters collaboration and sustainability. Key attributes of beautiful code include consistent style, clear naming, concise functions, and well-designed data structures. The book also stresses the role of modular architecture and patterns, like recursion and functional programming, in achieving reusable and robust software systems. Additionally, it underscores the importance of testing, offering insights into creating simple yet effective tests. Despite some complex optimizations, the overall goal remains making code readable and understandable, reflecting the diverse experience levels of developers. Beautiful code ultimately combines elegance with functionality, balancing performance and maintainability.

 


To expand on that previous blog post, I would like to add a brief review of “Good Code, Bad Code: Think Like a Software Engineer”, by Tom Long. This quote on the cover gives a good idea of the book’s motto: “Software development is a team sport. For an application to succeed, your code needs to be robust and easy for others to understand, maintain and adapt”. This book contains lots of ideas that are also expressed in the previously quoted references, for instance about how to make your code readable: Tom Long gives some interesting advice about naming (variables, functions, and classes), with an interesting warning about concision (not always a form of elegance). Many classical pieces of advice, such as avoiding deep nesting and adopting a consistent coding style, are also proposed. For instance, “make functions small and focused is one of the best ways to ensure that code is readable and reusable”, or make sure that anonymous functions (when using a functional programming style) are used for small and simple things. I will let the reader discover why comments matter (I do agree with Tom Long, but this is a long-debated topic). There are some great pages on how to properly use errors, because “often only the callers know if the error can be recovered from”, and some general principles about modularity (objects that evolve together should be grouped together) or the importance of encapsulation (hiding concrete data types through interfaces) and how to avoid leakage. In addition to these well-established ideas, a few topics are introduced with more substance than in the other books that I have quoted in this blog:

  1. Tom Long borrows the concept of “poka yoke” from lean and explains how to make code “hard to use the wrong way”, by making implicit design decisions explicit or by avoiding “magic values” (a single value with a non-standard, unique semantic). As he explains: “make sure that your code is understandable to someone with little or no context, and make it hard to break. You’ll not only be doing everyone else a favor, you’ll be doing your future self one too”.
  2. The value of layered abstraction – this is a really old and fundamental principle of system design, but it remains very relevant, from class/interface hierarchical design to module or service architecture: “Microservices can be an extremely good way to break up a system and make it more modular, but it usually does not change the fact that we still need … to create the right abstractions and layers of code”.
  3. A fair part of the book is dedicated to how to properly design classes and interfaces, and how to increase modularity and reuse through different techniques such as dependency injection (dependency injection of a shared context makes a class parametric and reusable), while making sure that the SSOT principle (single source of truth) is applied at all scales, from object attributes to databases to data products.

  4. Writing code that others can use means leveraging code contracts, pre/post conditions, and assertions such as invariants. Tom Long favors comments about the why, while comments about the what (what is expected) are better translated into actionable code (the heart of code contracts). Using code contracts is the best way to anticipate how others will use your code: “The term design by contract was first introduced by Bertrand Meyer in the 1980s and is a central feature of the Eiffel programming language and methodology”. As someone who was introduced to software engineering by Bertrand Meyer in 1981 during a summer internship, I fully agree. Some contracts may be enforced by the compiler (with a rich type system), others require run-time assertions. As noted by Tom Long: “it is often useful to think of the code we write as exposing a mini API that other pieces of code can use” (a short sketch combining contracts, immutability, and explicit constants follows this list).
  5. Another significant idea of the book – to which a few-line summary cannot do justice – is the value of immutable objects as a programming pattern. Immutable objects are easier to manage, to share, to distribute and make for code that is easier to maintain. Hence one should limit mutability and restrict its misuse through proper access methods. As the author writes: “avoid side effects or make them obvious – do not mutate input parameters unless it is totally explicit in the function name”.
  6. Last, the third part of the book covers the importance of unit testing, which would require a separate blog post. Note that unit testing was already part of the 12 principles of lean software factories more than a decade ago.
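To make items 1, 4, and 5 more concrete, here is a minimal Python sketch of my own (the domain, the names, and the threshold are illustrative assumptions, not examples from the book). It combines an explicit constant instead of a magic value, assertion-based pre/post conditions as a lightweight code contract, and an immutable value object:

```python
from dataclasses import dataclass

FREE_SHIPPING_THRESHOLD = 50.0   # explicit constant instead of a "magic value"

@dataclass(frozen=True)          # immutable value object: callers cannot mutate it by accident
class Order:
    amount: float
    items: tuple[str, ...]

def shipping_cost(order: Order, base_fee: float) -> float:
    """Contract: amount and base_fee must be non-negative; the result is never negative."""
    assert order.amount >= 0 and base_fee >= 0, "precondition violated"
    cost = 0.0 if order.amount >= FREE_SHIPPING_THRESHOLD else base_fee
    assert cost >= 0, "postcondition violated"
    return cost

order = Order(amount=42.0, items=("tire", "valve"))
print(shipping_cost(order, base_fee=6.9))   # 6.9
# order.amount = 0.0  # would raise FrozenInstanceError: mutation is explicit and forbidden
```

The assertions are a poor man’s version of Eiffel-style contracts, but they already document the “mini API” that the function exposes and fail fast when it is used the wrong way.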




3. Software-Driven Companies

Being software-driven is a company culture trait, not (only) a technical capability. Obviously, software development excellence matters, as developed in the introduction, but what makes software-driven companies excel is the recognition of the opportunities that the constant progress of software and digital transformation keeps opening. The scope of these opportunities requires a distributed approach where everyone in the company understands that software is eating the world. This is brilliantly explained in the introduction of the book “Code to Joy – Why Everyone Should Learn a Little Programming” by Michael L. Littman – a colleague of mine a long time ago at Bellcore. A rough understanding of how computers work and what they can do for us is a basic skill for humanity in the 21st century: “Programming includes any way we can convey a desired behavior to a machine to carry out on our behalf. Coding is one particularly technical way to get the job done. But current and imagined future systems take inspiration from the ways we teach each other to do things”. This book provides a survey of a few key programming paradigms, including an interesting introduction to the different styles: imperative (telling the computer what to do), declarative (telling the computer the expected logic of the outcome), inductive (learning from examples), and conversational (learning from attempts in repeated experiences): “the four teaching styles – telling, explaining, demonstrating and inspiring – work well as a roadmap for the four main mechanisms we have today for telling machines what to do”. This book is also a good journey to understand why the quality of the conversation between “business and tech” is critical to create value in a world that is eaten by software. As the author says, “If we all start to think a little more like programmers, we can demand that the software developers expose more of the programming of the machine to us”.

While Michael Littman’s is a recent book for everyone, “The Business of Software: What Every Manager, Programmer and Entrepreneur Must Know to Thrive and Survive in Good Times and Bad” by Michael A. Cusumano is 20 years old but still a great reference for managers – a classic recommendation from my former colleague Eric Chaniot. In this digital age, understanding the lifecycle of software products and why software must be understood as flows (dynamic view versus static – cf. the previous section) is critical for business managers.

  • First, the modern approaches to software product development demand a business buy-in. Trying to extract the most from agile software methodologies while retaining a “waterfall business mindset” is both frustrating and inefficient. Agility stems from the business understanding of digital homeostasis (constant adaptation). Constant adaptation in a VUCA world implies the end of the “agency model” (where some people think and others build/deliver): thinking and doing are two sides of the same coin. From a digital product perspective, it means that “biz is tech and tech is biz”. And digital products are everywhere, since every product and service must come with its “digital twin” API today, which I will explain further at the end of this section.
  • Second, the understanding of the software delivery cycle (CICD) and its constraints is necessary to master the timing of value delivery and the growth of user-feedback-generated value. More generally, our complex and constantly evolving world requires understanding the depth of iterative approaches and the need for regular refactoring and stepping back to perform asset management. If this mandatory hygiene is seen as a technical-only concern, technical debt (cf. previous section) will creep in and slow business down.
  • Third, as was pointed out a long time ago by Jeff Bezos in his famous API memo, an API strategy is critical to deliver agility and scale at the same time, and this is as much a business strategy as it is an architectural one. Software is not played in isolation; it is a “judo sport” played in ecosystems where companies must leverage the strengths of giants and understand the hidden value of open-source communities and platforms (understanding that communities and platforms always go hand in hand is a good way to start). Last, as was stated in the introduction, software is the building block that links data to AI opportunities. No data strategy or AI strategy can be carried out independently from software capabilities. If a company wants to change the world (i.e., stay relevant), it must interact with its business ecosystem through APIs.

To illustrate the idea that software-awareness and AI-fluency are critical skills for a modern business manager, I will show how digital twins are becoming critical tools to understand our world and plan our future ventures, in the same way that accounting, finance, or user experience design have proven to be necessary. I have talked about digital twins in many blog posts, such as the review of “Reinventing the Product”. Here I want to point out three reasons why understanding the concept of digital twins is business-critical:
  1. The World of Tomorrow is invented in a Digital Twin. The future world, which addresses the multiple challenges of climate change, resource depletion, and the quest for sustainability, will be created through simulations with digital twins at various scales. Digital twin simulations offer significant advantages in cost and time due to their inherent customizability and ability to run parallel processes. They provide a crucial tool for navigating the complexity of global processes and innovation. Moreover, the transition to a decarbonized world presents both a formidable challenge and a significant opportunity. For instance, companies like Michelin are leveraging AI to handle the variability in recycled and bio-material inputs, enhancing process precision and adaptability through machine learning and reinforcement learning. Additionally, resilience has become a fundamental property in the 21st century, especially for supply chains and factory operations. Structural analysis alone is insufficient for rare events, necessitating large-scale digital twin simulations to prepare for crises like COVID-19, major climate events, and geopolitical tensions.
  2. Hybrid Simulation Combining AI and Simulation Enables the Exploration of Complex, Intelligent, and Adaptive System Design. Hybrid simulation, which integrates AI and traditional simulation, facilitates the exploration and design of complex, intelligent, and adaptive systems. AI enhances digital twins by modeling behaviors and solving design and performance improvement problems through exploration and optimization. For instance, Michelin's hybrid AI accelerates performance by using machine learning to converge and explore faster, predicting results and performance accurately enough to guide exploration areas. Combining data-driven models with first-principles simulations creates richer and more efficient models, as demonstrated in the debate on DeepMind's GraphCast, which advocates for integration rather than opposition. Additionally, generative AI hybrids, such as foundational models, can generate 3D models and scenes from various inputs, including camera images and LIDAR captures, revolutionizing digital twin use in manufacturing by enabling the creation of detailed 3D models of machines or entire factories. The richness of this hybridization cannot be overstated: for instance, simulation may be used to generate synthetic data from which deep-learning AI trains adaptive control algorithms for the physical robots (a minimal sketch of this surrogate-training loop follows this list). The real world, projected into a digital twin, generates infinite training cases from which AI produces very practical, capable, and robust control algorithms to be placed back in the original physical world.
  3. Mastering Large-Scale Digital Twins is a Major Competitive Advantage for Creating the Great Systems of Tomorrow. Mastering large-scale digital twins is a significant competitive advantage for developing the major systems of the future, such as manufacturing, energy, and urban systems. Digital twins are composable and can integrate into systems of systems, necessitating considerations for interfaces (APIs), modular architecture, platforms (cf. the vision of Dassault Systèmes), and ecosystems, where collaboration with specialized partners is crucial. Modular engineering of digital twins is essential for creating large systems, with multi-scale systems of systems requiring horizontal and vertical digital continuity for enhanced digital manufacturing, which is a competitiveness issue for France. The necessity of collaboration within extended enterprises leads to seeing digital twins as collaboration platforms, aiding in understanding and sharing the workings of complex systems. 21st-century companies must adopt an ecosystem vision, where major players act as platforms, and innovative actors develop new components.
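To give a flavor of this hybrid pattern, here is a deliberately naive Python sketch (the toy physics function, the parameter names, and the model choice are illustrative assumptions, not Michelin’s actual hybrid AI): the expensive first-principles simulator generates synthetic data, a fast data-driven surrogate is trained on it, and the surrogate is then used to explore the design space before confirming the best candidates with the full simulation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def first_principles_simulation(params: np.ndarray) -> float:
    """Stand-in for an expensive physics-based digital twin run (toy model)."""
    stiffness, temperature = params
    return float(np.sin(stiffness) * np.exp(-0.01 * temperature) + 0.05 * stiffness)

# 1. Use the simulator to generate synthetic training data.
rng = np.random.default_rng(0)
X = rng.uniform(low=[0.0, 20.0], high=[10.0, 80.0], size=(500, 2))
y = np.array([first_principles_simulation(p) for p in X])

# 2. Train a fast data-driven surrogate on the synthetic data.
surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# 3. Use the cheap surrogate to explore the design space widely...
candidates = rng.uniform(low=[0.0, 20.0], high=[10.0, 80.0], size=(10_000, 2))
best = candidates[np.argmax(surrogate.predict(candidates))]

# 4. ...and confirm the most promising candidate with the full simulation.
print(best, first_principles_simulation(best))
```

The same loop, with a reinforcement learning agent in place of the regressor, is how simulated environments produce control policies that are then deployed on physical systems.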

 

4. Augmented Software Engineering 

Before I dive into the topic of augmented software engineering, I need to push back against the idea that, because of genAI, the need for software developers will decline rapidly, an idea that became very popular in 2023 when ChatGPT and GitHub Copilot became ubiquitous. On the contrary, my belief for the decade to come (it is hard to see further) is that:

  1. The productivity improvement made possible by genAI will be absorbed by the pace at which software is eating the world. This means that I see, in the companies around me, more opportunities than the existing capacities may deliver, and productivity improvements are necessary to match the requirements of digital homeostasis as mentioned above.
  2. Software craftsmanship will be more, not less, necessary in a world of co-development of code with the machine. Part of this comes from the preliminary findings that I will comment on in this section (genAI is a great tool, but one must keep excellent control to enjoy the benefits without suffering too much from the increased risk of defect injection). The other reason, which I will address in the conclusion, is more systemic – in the spirit of Kevin Kelly – distributing some of the execution to intelligent machines requires a higher level of understanding for assembly.

It is interesting to take a look at what we have learned after more than a year of using genAI for software development. The following illustration is a slide taken from a lecture that I gave twice last month. The structure of the slide is a crude version of the development process: from user conversations to user stories (design), writing new code, maintaining existing code, managing tests, and delivering software products to customers.

  • GenAI does a great job at generating new code. What I see is twofold. First, automatic completion works well because of context growth: using your existing code as the context allows the genAI tool to produce code that is very relevant. Statistics show that the acceptance rate grows from 20% to 30% as the users get more mature with the tool. This percentage has been improving regularly with new releases of the genAI tools; it also grows with the experience of the developer. Second, conversational tools (chats) are getting better at generating code from a prompt, when what you want to build is standard (which is, by definition, a very frequent use case but not always true). This explains why the speed-up that you get from augmented code generation varies considerably: it is marginal on very specific and complex use cases, but it is easy to reach multiplicative factors (100% to 500% improvement) on common and repetitive work. Also, the complexity of your code changes how you use genAI. For the complex IAM simulations that I write, I find that code completion is useful but that prompting does not work yet. The Stanford study, “Do Users Write More Insecure Code with AI Assistants”, supports what I was saying earlier: craftsmanship still matters, since code generated from prompts is definitely not free from mistakes. As noted in this study, developers with little experience easily become over-confident with the generated code (mimicking the false self-confidence of code generation tools).
  • As noted in a Thoughtworks podcast, writing new code is a great use case, but most of the developers’ time is spent on existing code rather than creating new code (40% of the time vs 5% according to one study quoted by Birgitta Böckeler). In the case of maintaining existing code, the picture is far less glorious. As explained by Adam Tornhill, when asked to help refactor existing code, genAI succeeds only one third of the time; another third of the time, the code still works but is not improved; and in the remaining third, code that worked before is actually broken. A lot of hope has been placed on genAI to help improve the quality or the safety of code, and this is not unrealistic, since conversational agents have been shown to make helpful suggestions or to notice some of the bad smells. In addition, genAI works well as a search tool to assist a developer in understanding the overall structure of an existing code base (more than understanding the structure of a complex algorithm), and it does a great job as a documentation assistant.
  • Generating tests, especially unit tests, using AI is a classical use case that started before the introduction of genAI. The short summary is that genAI works well for simple unit tests, when it is expected to translate English into code, but is more limited when required to invent the logic to build a test case associated with a declarative outcome (a minimal sketch of this use case follows the list).

  • GenAI has also shown promise in the earlier design phase, that is, either producing user stories as requirements for developers or generating code directly (in a no-code tool setup). However, the positive first experiments have mostly involved simple things, like citizen-developer utility apps or simple use cases. The jury is still out on whether that level of upstream automation will bring disruption. As of today, this is more a “nice to have” than a breakthrough.
  • Last, genAI is also very useful to deliver better software products, both from the viewpoint of automating software delivery and from the capacity to provide better support (user assistance and documentation). Generating user assistants from code is a very promising direction, since the early demonstrators are really convincing. I have seen a ChatGPT plugin that was fed with the code of a PowerApp application, which resulted in a user assistant that was able to explain both how and why the application worked.
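As a minimal illustration of the test generation use case mentioned above, here is a sketch using the OpenAI Python client (the model name, the prompt, and the function under test are illustrative assumptions; any tests produced this way still need the craftsmanship review discussed throughout this post):

```python
from openai import OpenAI  # assumes the openai package (>= 1.0) and an API key in the environment

client = OpenAI()

function_under_test = '''
def shipping_cost(amount: float, base_fee: float) -> float:
    return 0.0 if amount >= 50.0 else base_fee
'''

prompt = (
    "Write pytest unit tests for the following Python function. "
    "Cover the free-shipping threshold, a nominal case, and an edge case.\n"
    + function_under_test
)

# The model name is an illustrative assumption; use whichever model your organization has approved.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

generated_tests = response.choices[0].message.content
print(generated_tests)  # to be reviewed and adapted before joining the code base
```

This works well precisely because the task is mostly “translate English into code”; it does not, by itself, invent the business logic that a declarative outcome would require.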

 



[Illustration: slide from the lecture mapping genAI contributions onto the development process, from design and new code to maintenance, tests, and delivery]

If you agree with the previous orders of magnitude (in terms of improvement), you can see that the productivity improvement comes from two things: on the one hand, when writing new code, one may expect a significant “30% to 300%” gain depending on the type of code that is being produced. For the rest of the development pipeline, automation does produce improvement (mostly, as is the case for most genAI use cases, through better assisted search), but with time savings in the 5% to 10% range. Last, the larger your company is, the more people spend their time talking to each other, for which genAI brings some improvement, but still at a marginal level. As noted by Birgitta Böckeler in a previous Thoughtworks podcast, a typical developer does not spend all her/his time in front of a keyboard. You can play with this crude model (a small sketch follows below) using your own assessment of the new versus old code ratio, and of how complex and specific your code is versus common and framework-based, but you will usually get an end-to-end productivity improvement (how many user stories you may deliver per unit of time) between 5% and 10% (a large company with thousands of developers reported a gain of 7%, which is consistent with this model). You may think that this is a small number compared to the huge expectations of the hype cycle, but this is already a significant improvement. This makes software engineering one of the major domains of genAI applications, as reported by Bain: “As companies get their hands dirty with generative AI, they are reporting a small reduction in performance compared with expectations. Five use cases show signs of success: sales and sales operations, software code development, marketing, customer service, and customer onboarding. Meanwhile, use cases in legal, operations, and HR appear less successful (see Figure 3)”.
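To make this crude model explicit, here is a back-of-the-envelope sketch in Python (the time split and the speed-up factors are illustrative orders of magnitude taken from the discussion above, to be replaced by your own figures):

```python
# Back-of-the-envelope model of end-to-end developer productivity gain from genAI.
activities = {
    # activity: (share of total time, genAI speed-up factor on that activity)
    "writing new code":          (0.05, 2.00),  # "30% to 300%" gain, here assumed ~100%
    "working on existing code":  (0.40, 1.07),  # modest help on maintenance and refactoring
    "tests, delivery, support":  (0.15, 1.08),  # assisted search, documentation, test generation
    "meetings and coordination": (0.40, 1.02),  # mostly unaffected
}

baseline_time = sum(share for share, _ in activities.values())                  # = 1.0
augmented_time = sum(share / speedup for share, speedup in activities.values())

gain = baseline_time / augmented_time - 1
print(f"End-to-end productivity gain: {gain:.1%}")   # about 7.5%, in the 5% to 10% range
```

Changing the share of new code or the speed-up on complex versus framework-based work moves the result, but it is hard to escape the single-digit range as long as writing new code is a small fraction of the total time.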

The impact of augmented software engineering is more than velocity. I suggest that you read Colin Alm’s paper “Measuring the impact of Developer Experience and GitHub Copilot” where he states that if you really want to measure its impact on the team, you have to look beyond how many suggestions were accepted (a leading indicator) and measure lagging indicators. If you use GitHub Copilot, you should see improvements in the following areas:

  • More frequent deployments/reduced cycle times.
  • Fewer build failures.
  • Improved code quality and higher test coverage.
  • Faster code review times 
  • Fewer security vulnerabilities and improved MTTR 
  • Better flow metrics 
  • Accelerated developer growth 
  • Better talent acquisition and retention 

The subjective finding of improved pleasure at work through these genAI tools matches my own experience very well when writing complex code for IAM simulation: the actual productivity gain is small, but the constant “dialog” between the code producer and the copilot makes for a great experience that I could not imagine living without anymore. This is very much aligned with the ACM study about GitHub Copilot’s impact on productivity. This study uses a productivity framework called SPACE based on five dimensions: Satisfaction (and well-being, the pleasure of writing code), Performance, Activity, Communication and collaboration, and Efficiency. This study found that acceptance rate was better correlated with performance improvement than the actual persistence of generated code: “This suggests that a narrow focus on the correctness of suggestions would not tell the whole story for these kinds of tooling. Instead, one could view code suggestions inside an IDE to be more akin to a conversation”.

To return to the more general question of what genAI can and cannot do (i.e., generate code versus reason about existing code), I want to emphasize the article from Megan Morrone at Axios, “When AI-produced code goes bad”, which you may see referenced in the previous illustration. You will find other interesting references there, such as a study from GitClear that found that AI assistants produce code that has to be fixed a few weeks after it was authored. This leads her to ask whether “generative AI tools might save time and money upfront in code creation and then eat up those savings at the other end”. The short-term answer, which was the thesis of this blog post, is that software craftsmanship is required to collect the good from genAI automation without the bad. The longer-term answer is more subtle, since genAI tools will get better at understanding existing code.
The fact that genAI assistants are better at generating code than at understanding it is not iron-clad; things do evolve as we enrich LLMs with many more tricks such as CoT (chain of thought), whose contribution to better code generation from a problem statement is critical. I often hear that “genAI does not understand anything about semantics”, which is too strong a statement. In the process of compressing huge amounts of code into LLMs, some intermediate structures emerge that play the role of implicit semantics. Implicit means that it is hard to reason about (but not impossible; this is the role of hybrid AI, of which CoT is an example). To get a better sense of where code generation will go in the future, you may read Leopold Aschenbrenner’s essay on AGI (read the section about “Unhobbling”).

 

5. Conclusion 

 

I will conclude this post, which is already long enough, with a simple idea that I leave as food for thought. Artificial intelligence is a “complexity absorber”, but as we distribute AI (intelligent nodes) in our factories, our business processes, and our software development factories, the orchestration of the network requires more sophisticated control. This observation is true at all scales, in all the value chains operated by our companies. In the same way that smart adaptive robots require smarter control – which is precisely why we need AI-enriched simulation and digital twins as exposed in Section 3 – generative AI assistants will automate our repetitive tasks and make our jobs both more interesting and more complex (hence the pattern: the more AI automation, the more the need for systemic craftsmanship). We know that distributed AI is mandatory for 21st century agility (a fact that I implicitly used in this blog post but which is a central thesis of my previous book), but it comes with a price stated by complex system theory: “the coordination of smart things is more complex and challenging than the coordination of dumb things”. This is actually the basis of the management theory changes of the past thirty years, evolving from “command & control” to “recognition, response and orchestration”.
