Saturday, September 30, 2017

Hacking Growth, when Lean Management meets Digital Software

1. Introduction

This blogpost is both about a truly great book, “Hacking Growth: How Today’s Fastest-Growing Companies Drive Breakout Success” by Sean Ellis and Morgan Brown, which could be described as the reference textbook on Growth Hacking, and a follow-up to many previous conversations in this blog and other posts (in French) about growth hacking. There are many ways to see Growth Hacking in relation to Lean Startup. Following the lead of Nathan Furr and Jeff Dyer, I see Growth Hacking as the third step of a journey that starts with design thinking, is followed by the delivery of a successful MVP (minimum viable product) and continues with growth hacking. The goal of design thinking is to produce the UVP (Unique Value Proposition) – it is a first iterative loop that produces prototypes. The goal of the “MVP step” is to produce a “vehicle” (a product) to learn how to deliver the UVP through feedback and iteration. The growth hacking phase is about navigating towards success and growth with this MVP.
I like Nathan Furr’s idea of a “minimum awesome product”, in the sense that very crude prototypes belong to the design thinking phase (to validate ideas). Growth Hacking works with a product that is “out there” with real customers/users in the real world, and it only works with a “minimum awesome product” that delivers the value of an awesome UVP (following Ash Maurya). This means that the classical “Learn, Build, Measure” cycle is a common pattern of the three stages: design thinking, MVP building and Growth Hacking.

This post is also a follow-up to the post about “software culture and learning by doing and problem solving”. Growth Hacking is a perfect example of “lean continuous improvement culture meets agile software development practice”. It is a structured and standardized practice – in the lean sense – for extracting value from a user feedback learning loop, both quantitatively and qualitatively. As such, it is more general than typical digital products and may be applied to a large class of software products, on closed as well as open markets. I would argue that learning from user feedback and user usage is a must-do for any software organization, from IT departments to ISVs.

This post is organized as follows. The next part is about Growth Hacking as a control loop. I will first recall that growth hacking is about turning customer feedback into growth thanks to the (digital) product itself (it may be applied to products other than digital ones, but “hacking” signals that it was designed for software-defined products). The third part is about the true customer-centricity of Growth Hacking and the importance of the “aha moment”, when the customer experiences the UVP (Unique Value Proposition). The last part talks about the dual relationship between teams and user communities. The product acts as a mediator between a product team and a group of communicating users organized into a community: it delivers new experiences and gathers new feedback. But there is more than numbers to Growth Hacking.

A last caveat before jumping into our topic: this book is a “user manual” of Growth Hacking for practitioners. A summary does not really work and I will highlight a few key ideas and associated quotes, rather than attempting to cover the book’s material. I urge you to read the book, especially if you are in charge of a digital product or service and trying to grow its usage.

2. The Growth Hacking Control Loop

Growth hacking is about developing market and usage growth through the product itself and a control loop centered on customer feedback. I borrow the following definition from the beginning of the book: “Growth hacking allows companies to efficiently marry powerful data analysis and technical know-how with marketing savvy, to quickly devise more promising ways to fuel growth. By rapidly testing promising ideas and evaluating them according to objective metrics, growth hacking facilitates much quicker discovery of which ideas are valuable and which should be dismissed”. What sets the software world apart is that the product is the medium. This is one of the key insights of Growth Hacking (and part of the reason for the word “hacking”). Using the product as the medium to communicate with users has many benefits: it is cheap (a fixed cost, which is good for scaling); it is effective (there is a “rich bag of tricks” that the book illustrates); it is efficient, since it reaches 100% of users; and it is precise, since software analytics make it possible to know exactly what works and what does not. The authors summarize this as “enabling our users to grow the product for us.”

Growth Hacking may be implemented at all scales – from a startup to a large company – and for a large range of software-defined experiences, from mass-market digital goodies such as mobile apps to B2B commercial software. This is repeated many times in the book: “Nor is it just a tool for entrepreneurs; in fact, it can be implemented just as effectively at a large established company as at a small fledgling start-up”. As told in the introduction, this makes the book valuable for most companies since “software is eating the world”: “General Electric CEO Jeffrey Immelt recently said that “every industrial company will become a software company,” and the same can be said for consumer goods companies, media companies, financial services firms, and more”. What makes the value proposition of Hacking Growth so interesting is the track record of all the Silicon Valley companies that have used this approach: Twitter, Facebook, Pinterest, Uber, LinkedIn … and many more. Even though each story is different and each “growth hack” must be tuned to the specifics of each product, there is a common method: “It wasn’t the immaculate conception of a world-changing product nor any single insight, lucky break, or stroke of genius that rocketed these companies to success. In reality, their success was driven by the methodical, rapid-fire generation and testing of new ideas for product development and marketing, and the use of data on user behavior to find the winning ideas that drove growth”.

The Growth Hacking control loop is built around measurement, in a classical Plan-Do-Check-Act cycle. The first message here is that Growth Hacking is grounded in data. This is very much in the lean startup spirit: decisions are based on data, and the first step of the approach is to collect the relevant data. This is especially true for large companies, as is explained in the book with a number of examples: “Recognizing that Walmart’s greatest asset is its data, Brian Monahan, the company’s former VP of marketing, pushed forward a unification of the company’s data platforms across all divisions, one that would allow all teams, from engineering, to merchandising, to marketing, and even external agencies and suppliers, to capitalize on the data generated and collected.” A growth hacking strategy starts with a data analytics strategy. Making decisions based on data requires quality (for precision) and quantity (for robustness). There is clearly a “data engineering” dimension to this first step: a “data integration architecture and platform” is often needed to start the journey. The story of Facebook is a good case in point: “in January of 2009, they took the dramatic step of stopping all growth experiments and spending one full month on just the job of improving their data tracking, collection, and pooling. Naomi Gleit, the first product manager on Facebook’s growth team, recalls that “in 2008 we were flying blind when it came to optimizing growth.”” This data fuels a PDCA cycle: “The process is a continuous cycle comprising four key steps: (1) data analysis and insight gathering; (2) idea generation; (3) experiment prioritization; and (4) running the experiments, and then circles back to the analyze step to review results and decide the next steps.” Here we see the reference to the Lean approach, or to the TQM heritage of W. Edwards Deming.

Growth hacking is a learning process with a fast cycle time. Most growth hacks do not yield positive results, so it is critical to try as many as possible. In the end, success depends on “the rapid generation and testing of ideas, and the use of rigorous metrics to evaluate—and then act on—those results”. There is a lot of emphasis on the speed (of implementation) and the rhythm (of experimentation). Many examples are given to show the importance of “fast tempo”: “Implementing a method I call high-tempo testing, we began evaluating the efficacy of our experiments almost in real time. Twice a week we’d look at the results of each new experiment, see what was working and what wasn’t, and use that data to decide what changes to test next”. It really boils down to the necessity of exploring a large optimization space, without knowing in advance what will work and what will not. The authors quote Alex Schultz from Facebook: “If you’re pushing code once every two weeks and your competitor is pushing code every week, just after two months that competitor will have done 10 times as many tests as you. That competitor will have learned 10 times, an order of magnitude more about their product [than you].” To achieve this fast cycle, one must obviously leverage agile methods and continuous delivery, but one must also use simple metrics, with one goal at a time. A great part of the book deals with the “North Star” metric, the simple and unique KPI that drives a set of experiments: “The North Star should be the metric that most accurately captures the core value you create for your customers. To determine what that is you must ask yourself: Which of the variables in your growth equation best represents the delivery of that must-have experience you identified for your product?”.
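To make “rigorous metrics” a little more concrete, here is a minimal sketch of how one experiment could be judged against a conversion-style metric. It is not taken from the book: the two-proportion z-test is my own choice of technique, and the function name and the sample figures are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare variant B against control A on a conversion metric.

    Returns (lift, z, p_value), where lift is the absolute difference
    in conversion rate and p_value is two-sided.
    Hypothetical helper for illustration only.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis "A and B convert equally"
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Nobody (or everybody) converted: no evidence of a difference
        return p_b - p_a, 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Made-up numbers: 1,000 users per arm, 100 vs 150 conversions
lift, z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"lift={lift:.3f} z={z:.2f} p={p:.4f}")
```

With a twice-a-week review rhythm, a check of this kind is cheap enough to run on every experiment; the harder part is choosing the single metric that the test is run on.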

3. Growth Hacking is intensely customer-centric

Growth Hacking starts when the product generates an “aha moment” for the user. The “aha moment” happens when the user experiences the promise that was made in the UVP. The product actually solves a pain point and the user gets it. Growth Hacking cannot work if the MVP is not a “minimum awesome product” that delivers the promise of a great UVP (redundant, since U means unique). As the authors say, “no amount of marketing and advertising—no matter how clever—can make people love a substandard product”, hence “one of the cardinal rules of growth hacking is that you must not move into the high-tempo growth experimentation push until you know your product is must-have, why it’s must-have, and to whom it is a must-have: in other words, what is its core value, to which customers, and why”. The book logically refers to the Sean Ellis ratio, and the corresponding survey that is applied to customers to find out who would be truly annoyed if the product were discontinued. Sean Ellis’s data mining over a very large sample of startups shows that this ratio must reach 40% before one can say that consumers “love your product” and start scaling successfully.
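As a minimal sketch, the ratio can be computed from raw survey answers as follows. The answer labels and function names are assumptions made for the example; only the 40% threshold comes from the text above.

```python
from collections import Counter

# The 40% level cited above; everything else in this sketch is illustrative.
MUST_HAVE_THRESHOLD = 0.40


def must_have_ratio(responses):
    """Share of respondents who would be very disappointed without the product.

    `responses` is a list of answer labels; the label "very disappointed"
    is an assumed wording, not necessarily the survey's exact text.
    """
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no survey responses")
    return counts["very disappointed"] / total


def is_must_have(responses, threshold=MUST_HAVE_THRESHOLD):
    """True when the ratio reaches the threshold for starting to scale."""
    return must_have_ratio(responses) >= threshold


# Made-up survey of 100 users
survey = (["very disappointed"] * 45
          + ["somewhat disappointed"] * 35
          + ["not disappointed"] * 20)
print(must_have_ratio(survey), is_must_have(survey))
```

The computation is trivial; the value lies in running the survey on real, active users before investing in high-tempo experimentation.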

Getting to a product that may deliver an “aha moment” is the goal of the MVP cycle. This is another topic, but the book gives a few pieces of sound advice anyway. The most salient one is to stick to the “minimum” of MVP and to focus on simplicity instead of adding features: “all product developers must be keenly aware of the danger of feature creep; that is, adding more and more features that do not truly create core value and that often make products cumbersome and confusing to use.”

The ultimate goal is to make one’s product a customer habit, up to an addictive one. This is a clear consequence of the growth model made famous by the Pirate Metrics: once acquisition flows, retention becomes the heart of the battle. In our world of digital abundance, retention is only won when the product becomes a habit: “The core mission for growth teams in retaining users who are in this midterm phase is to make using a product a habit; working to create such a sense of satisfaction from the product or service that over time…”. The book is full of suggestions and insights to help the reader design a product that could become a habit. For instance, it leverages the “Hook Model” proposed by Nir Eyal. The hook model has four parts, organized into a cycle: trigger, action, reward, investment. Many of the growth hacks follow this cycle to build up a habit. The book offers a step-by-step set of examples to build triggers (based on customer journeys), to develop all kinds of rewards and to foster customers’ investment in the experience (for instance through personalization and self-customization, leading to the feeling of ownership, a key component of emotional design). As the authors note: “some of the most habit-forming rewards are the intangible ones. There are many kinds of rewards to experiment with in this category. There are social rewards, such as Facebook’s “Like” feature, which has been a strong driver in making the posting of photos and comments habitual.”

The authors advocate leveraging the growing wealth of knowledge that is produced by psychology and behavioral economics. Obviously, the work of Daniel Kahneman is quoted (cf. his great book “Thinking, Fast and Slow”), but we could also think of Dan Ariely or Richard Thaler. There are many other interesting references to other frameworks to influence customer behavior, such as Robert Cialdini’s six principles. Among those principles, the principle of commitment and consistency may be used to drive revenue by asking customers to make small commitments that create a solid bond that may be leveraged later on. The principle of social proof is based on research that shows that we tend to have more trust in things that are popular or endorsed by people that we trust. Another book that is quoted here is “The Art of Choosing” by Sheena Iyengar; I strongly recommend her video here. It is interesting to understand that choice has a cost, and that some of these choices should be avoided: “Debora Viana Thompson, Rebecca Hamilton, and Roland Rust, found that companies routinely hurt long-term retention by packing too many features into a product, explaining “that choosing the number of features that maximizes initial choice results in the inclusion of too many features, potentially decreasing customer lifetime value.””

4. Growth Hacking : Teams meet Communities

Growth Hacking is a team sport. The importance of teams is a common thread throughout the book. These are teams in the cross-functional, agile and empowered sense: “the creation of a cross-functional team, or a set of teams that break down the traditional silos of marketing and product development and combine talents”. Following the lean software principles, the cross-functional team is not a group of “siloed experts”, but of T-shaped profiles who bring their own skills and talents but understand each other: “You need marketers who can appreciate what it takes to actually write software and you need data scientists who can really appreciate consumer insights and understand business problems”. I borrow yet another quote on the importance of cross-functionality, since this is a key idea for effectively turning technology into innovation, in a larger context than growth hacking: “growth hacking is a team effort, that the greatest successes come from combining programming know-how with expertise in data analytics and strong marketing experience, and very few individuals are proficient in all of these skills”.

Growth Hacking leverages communities of communicating users. Growth Hacking is a story with three protagonists: the team, the product and a user community, with the product as the mediator between the team and the community. The importance of user community is also superbly expressed by Guy Kawasaki … and Steve Jobs. The community is the preferred tool to get deep insights from users because analytics is not enough: “Preexisting communities to target for insight into how to achieve the aha moment can also, of course, be identified digitally”. The combination of the “aha moment” that we saw in the previous section and the community of “evangelists” is what is needed to start the growth engine: “Once you have discovered a market of avid users and your aha moment—i.e., once product/market fit has been achieved—then you can begin to build systematically on that foundation to create a high-powered, high-tempo growth machine”. The community of active, engaged, communicating users is needed to get qualitative feedback in addition to the quantitative feedback that one gets with software analytics. This deeper insight is needed to truly understand customer behavior: “it’s crucial that you never assume why users are behaving as they are; rather, you’ve always got to study hard data about their behavior and then query them on the basis of observations you’ve made in order to focus your experimentation efforts most efficiently on changes that will have the greatest potential impact”. This fine understanding of customer behavior is necessary to eradicate friction, which is a key goal of experience design, that is, removing “any annoying hindrances that prevent someone from accomplishing the action they’re trying to complete”.

When striving for growth, just as one should focus on a single metric, it is better to focus on a single distribution channel, or very few. A large part of the book demonstrates this with illustrative examples. Focusing on a distribution channel helps to narrow the diversity of customer experience and makes the iterative optimization of Growth Hacking better targeted and more efficient: “Marketers commonly make the mistake of believing that diversifying efforts across a wide variety of channels is best for growth. As a result, they spread resources too thin and don’t focus enough on optimizing one or a couple of the channels likely to be most effective”. Growth Hacking is often associated with virality. Indeed, virality is a key growth engine and, as Seth Godin explained, virality must be designed as part of the product experience: “when you do focus on instrumenting virality, it’s important that you follow the same basic principle as for building your product—you’ve got to make the experience of sharing the product with others must-have—or at least as user friendly and delightful as possible”. However, virality is only one aspect that may be tuned by the iterative Growth Hacking optimization cycle. Acquisition and Retention come first in the customer journey and should come first in the growth hacking process.

As explained in the introduction, a summary would not do justice to this book, which is full of great illustrative examples and relevant data points and metrics. This is definitely useful for growing mobile applications: “For example, for mobile notifications, opt-in rates range from 80 percent at the high end, for services like ride sharing, to 39 percent at the low end for news and media offerings, according to Kahuna, a mobile messaging company”. Growth hacking is based on building growth models that are validated, tuned or invalidated with experiment cycles. The book is filled with key ratios that are more than useful to start this modeling with default values that make sense. Here is another example that is truly valuable for anyone who tries to understand her or his application’s retention numbers: “According to data published by mobile intelligence company Quettra, most mobile apps, for example, retain just 10 percent of their audience after one month, while the best mobile apps retain more than 60 percent of their users one month after installation”. Focusing on measurement is obviously the way to go, but making sense of measures requires modelling, and this book is a great help in achieving this.
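To relate one’s own numbers to these benchmarks, a first step is to compute a one-month retention rate from install and activity data. The sketch below is only one possible definition (a user counts as retained if seen again 30 days or more after installing); the function names, the data layout and the band labels are assumptions of mine, and only the 10% and 60% reference points come from the quote above.

```python
from datetime import date, timedelta


def one_month_retention(install_dates, activity, window_days=30):
    """Share of installed users seen again at least `window_days` after install.

    install_dates: dict mapping user id -> install date
    activity: iterable of (user id, activity date) pairs
    Both structures are hypothetical; adapt them to your analytics export.
    """
    if not install_dates:
        raise ValueError("empty cohort")
    retained = set()
    for user, day in activity:
        installed = install_dates.get(user)
        # Ignore activity from users outside the cohort
        if installed is not None and day - installed >= timedelta(days=window_days):
            retained.add(user)
    return len(retained) / len(install_dates)


def benchmark(rate, typical=0.10, best=0.60):
    """Place a retention rate against the two reference points quoted above."""
    if rate > best:
        return "best-in-class range"
    if rate > typical:
        return "above typical"
    return "typical or below"


# Made-up cohort of four users
installs = {
    "a": date(2017, 1, 1),
    "b": date(2017, 1, 1),
    "c": date(2017, 1, 5),
    "d": date(2017, 1, 5),
}
events = [
    ("a", date(2017, 2, 3)),   # 33 days after install: retained
    ("b", date(2017, 1, 10)),  # 9 days after install: not counted
    ("c", date(2017, 2, 4)),   # exactly 30 days after install: retained
    ("x", date(2017, 3, 1)),   # not in the cohort: ignored
]
rate = one_month_retention(installs, events)
print(rate, benchmark(rate))
```

A real growth model would track the whole retention curve per cohort rather than a single point, but even this single ratio gives a default value to compare against the published ones.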

5. Conclusion

Growth Hacking is the third loop of the following representation of Lean Startup, which was developed and used at AXA’s digital agency. As explained in the introduction, the goal of the first loop is to produce the proper UVP. No one should ever start developing a product or a service without a first-class UVP; as Ash Maurya said: “life is too short to build products that people will not use”. This is hard work, but many good guides are available, such as Ash Maurya’s Running Lean. Once the UVP is crafted, there are three huge and separate challenges:

  1. To build an MVP that delivers the promise of the UVP. This is actually incredibly difficult for large organizations: there is always a shortcut that seems faster (and the pressure to deliver is huge) and there are too many stakeholders who will contribute to diluting the UVP. My personal experience, from the innovation lab to the hands of the customer, over the last 10 years, is that the UVP is lost 90% of the time. As was stated earlier, Growth Hacking starts when the “aha moment” is delivered, but this is not a zero-one situation and Growth Hacking may be used to debug or improve this “aha moment”.
  2. To craft and deliver the story of the UVP to the customer. I have been amazed during the same past 10 years at the number of times a great UVP was built into a product, a service or an app, and customers were simply not aware of it. Each time you would demonstrate the experience to a customer, you would see the “aha moment” and the smile, but a one-to-one personal demo is not a scalable method. This book precisely addresses this problem. My experience over the years has been that the crafting of this story should be co-designed with the development team. Understanding the link between the pain points, the promise and the user stories is a key factor in building a consistent and delightful experience.
  3. To help the customer, once the UVP is “in the box” and once the customer has understood what it is, so that this experience may actually be found! This is obviously a question of user experience design and usability, but it is a tough one. Here also, Growth Hacking is more than relevant: continuous iteration is the only way to solve this problem.

Sunday, June 18, 2017

Digital Experience Factories

1. Introduction

I left AXA a month ago to join Michelin. This is always a great moment to reflect on some of the ambitions of the past years. Today I will write about Digital Experience Factories. As AXA Group Head of Digital, I worked on setting up a Digital Experience Factory, a software development organization geared to produce digital artefacts and experiences, following my previous work on lean software factories at Bouygues Telecom. The introduction of “experience” in the name is a way to emphasize the importance of customer experience in the digital world, but this has always been a key ambition of lean software factories, as is shown on the illustration in the next section. The goal of this post is to reformulate the key ideas and principles of a Digital Experience Factory, now that I have added a few more years of experience. It should be said that the concept of software factory is now well established and that most of what looked new in 2012 is mainstream in 2017.

I have a long history of experience and interest in software factories, agile organizations and lean software, but I really started to put the pieces together in 2012. I defined the “Lean Software Factory” as the target for our Bouygues Telecom Internet product software division (Internet gateways and set-top boxes) by merging principles from Agile (mostly SCRUM), Extreme Programming and Lean, as explained in this previous post. The theory and the background references were rich, but we actually focused on four practices only:
  • Team problem solving
  • Using visual management in a project room
  • Reducing WIP through Kanban
  • Love your code (5S for the code, coding discipline, code review, gardening, etc.)

This vision was presented at the Lean IT Summit in 2013 and you may find the slides here with both the general principles and the four practices. For French readers, a simplified presentation was made at the 4th Lean IT Summit in Lyon (2014), with the attached slides.

I will start this post with an illustration that was produced in 2012, because I have reused it extensively at AXA in a digital factory context. Although the picture was produced to illustrate our ambition with set-top boxes, it is sufficiently user-centric and generic to be widely applicable. I was happily surprised to find it so relevant five years later in a different context. I will propose a short summary in the next section.

Section 3 will focus on the critical dependency between the innovation process and the software factory. I have spent most of the past three years setting up lean startup innovation processes. I have already touched, in a previous blogpost, on the importance of the relationship between the innovation and software delivery processes, but I would like to emphasize the co-dependence of these two processes, which I have labelled “from customer to code” (lean startup) and “from code to customer” (devops). In the digital world (more generally in the modern complex world), “the strategy is the execution” (i.e., you are what you do).

The last section will talk about the role of digital experience factories in the world of exponential information systems. Since “software is eating the world”, it creeps in everywhere in companies’ businesses, inside the company (each piece of equipment in a factory or human collaboration is becoming “smart or augmented”) and outside (customers’ digital lives or business partners). Software factories have a role to play in a larger software ecosystem, with a multiplicity of roles and stakeholders.

2. Digital Experience Factory Blueprint

The following picture is an illustration of a Digital Experience Factory – as well as a lean software factory. Although it is pretty old (in digital time), I have found that it is still a good blueprint for setting up a software factory in the digital world.

This picture is pretty much self-explanatory but I would like to point out a few things, i.e., explain the choice of a few keywords. To keep things short, let’s define the seven foundations of the Digital Experience Factory:

  1. The input for the factory is made of “pain points” & “user stories”. No one should be surprised to see user stories for an agile software shop, but I have found that they should not be separated from the “original pain points” from which they are derived. Our practice at AXA has been to build “UVP trees”, which are a graphical representation that links the pain points, the UVP (unique value proposition) and the user stories. Sharing the UVP with everyone in the software shop considerably improves the quality of the code (from a customer experience point of view). More generally, a key principle of a lean organization is to make sure that “the customer is represented on the production floor” and that customer testimonies – including pain points – are available (visually) to all actors of the process (not just the designers or product marketers).
  2. The output of any software development is end-user experience and is measured with user satisfaction. Customer satisfaction is the “true north” of any lean (in the Toyota Way sense) organization. Because customer satisfaction is complex (in a systemic sense), it requires an incremental approach and a feedback loop.
  3. CICD (Continuous Integration and Continuous Delivery) is the crown jewel of modern software organization. This is where the huge productivity gap resides, but this also requires significant efforts to set up. I refer you to the great Octo book, the Web Giants, which I have used extensively to evangelize and promote change in the past 5 years. CICD starts with Continuous Build and Integration and continues with Continuous Delivery using DevOps practices. This is probably the part of the picture that has evolved the most in the past 5 years since DevOps is now a mainstream critical recommendation.
  4. Test-driven development is also a critical aspect of a digital experience factory because it fuels the CICD ambition (automated tests and automated delivery go hand in hand), but also because it helps to produce higher quality code with less stress, hence more pleasure. I urge you to read Rich Sheridan’s wonderful book “Joy, Inc” to understand the importance of culture and pride in software development. This obviously goes back to the three tenets of self-motivation according to Daniel Pink: autonomy, mastery and purpose.
  5. Visual Management & Kanban are the most visible parts that are borrowed from the lean management principles in a Digital Experience Factory. There are two plagues in most software organizations: rework (and its cousin, dead code) and waiting (people waiting for one another). Every software development audit that I have had to undergo in my past 15 years of professional experience has found these two issues. Visual Management, in general, and Kanban, in particular, are the best way to tackle these two problems.
  6. “Source code is king” in a digital software factory. The new world of software is characterized by the increased rate of change and innovation. This creates two new requirements: (a) one must love one’s source code because it will have to be looked at and changed constantly; (b) one must reuse as much existing code as possible, hence the importance of leveraging open source as a code feed (cf. the illustration).
  7. Synchronized team work : the digital experience factory is organized into squads, autonomous cross-functional teams. There are at least three important ideas here. First, Squads are cross-functional teams where all necessary skills are working together. Second, synchronized work means working together at the same time on the same problem. This creates an environment where everyone “understands a little bit of everything” which reduces informational friction considerably. Last, the squad is autonomous, both for speed and motivation.

3. Innovation Factories and Learning Loop

The following picture is borrowed from a presentation that I made at XEBICON 2015. It is the best illustration that I have of the interdependence of the innovation process and the software delivery process, and it illustrates what I have been attempting to build at AXA during the past few years.

The key point of this picture is that, although there are two processes, there is only one team and one product that is being built. In other words, the same people participate in both processes. The capability that the first process is aiming to build is to produce digital artefacts (mobile or web apps, connected objects, cloud services, etc. – i.e., code) from listening to, and observing, the customer. The capability of the second process is to deliver to customers, at scale, a product or service from the original code that is produced by the developers, in a continuous, high-frequency and high-quality manner. What I have learned over the years is that these two processes are very dependent on each other, which is, once more, a lesson from the Web Giants! It is very difficult to run a lean software factory without the true customer centricity of a lean startup approach (cf. the importance of customer pain points, satisfaction, user stories and testimonies in the previous section). It is equally difficult to implement a lean startup approach without the performance of a great DevOps software factory: the iterative process requires a high frequency of delivery, and customer satisfaction demands high-quality software with high performance and as few defects as possible.

As stated earlier, strategy and execution are merged in the digital world: a strategy only becomes real when it has been executed and adapted to the “real-time” environment, and execution requires growing and adapting the strategy continuously. Success becomes a function of the “digital situation potential”, which is a combination of skills, low technical debt and a flexible open architecture. The following illustration is borrowed from the same blog post. It shows the importance of separating different time scales:
  • T0: The immediate time scale of customer satisfaction: delivering what is requested. This is the takt time of the factory process.
  • T1: The “mid-term” time of continuous improvement. This is the kaizen time, which is more uncertain since some problems take longer to solve.
  • T2: The “long-term” time of learning. In a complex world, most learning is performed “by doing”. Training happens “on the gemba”, through practice.

This picture is also a great illustration of how the Digital Experience Factory benefits doubly from its agile and lean roots. Lean and Agile reinforce each other. Agile software development practices are mostly T0 (and some T1 with reflection practices) while lean emphasizes T1 (kaizen) and T2 (kaizen again, plus dojo practices). Agile and SCRUM were born as “project development methods” whereas lean software development is geared towards product development. I have covered this topic in more detail in my post about Lean and Architecture. Long-term sustainable development requires architecture, and it is not easy to see where architecture fits in the agile framework, while architecture is a cornerstone of lean sustainable development. I also refer you to the book “Lean Architecture: for Agile Software Development” by James O. Coplien & Gertrud Bjørnvig.

A key piece of the Digital Experience Factory illustration is the feedback loop from customer experience (i.e., the satisfaction or the absence of satisfaction / the usage or the absence of usage). I have formalized the “Customer Feedback Learning Loop” (CFLL) over the years, and our experience at AXA has helped a lot to set up best practices that may be easily reproduced.
  • CFLL is practiced with three “channels”: implicit, explicit and social. Implicit listening means using the power of embedded analytics to track the effective usage of customers. Explicit is “active listening” of users to hear about their usage and satisfaction. Explicit means that we look for verbatims (e.g., in the stores) and testimonies from users (through interviews). Active means that this is a conversation: we may ask questions or answer customers. Social listening requires setting up communities and digital tools so that users may act as a group. Experience over the past 15 years has shown that the dynamics of feedback are very different with a group, which feels more empowered, than with individuals.
  • CFLL is managed as any quality improvement loop, using a “Toyota-style A3” which supports a PDCA (Plan-Do-Check-Act) approach. Looking for root causes using the “5 whys”, setting up kaizens with the whole team, and carefully formulating assumptions are critical, since digital troubleshooting is hard and full of counter-intuitive surprises.
  •  CFLL is part of the Growth Hacking toolbox and also leverages “source code as a marketing tool”, that is making the digital product a marketing and sales channel for itself.
  • Because social tools are useless without a community, a key task of the CFLL approach is to grow and nurture a community of engaged users. This is very well explained by Guy Kawasaki in his book “The Art of the Start”: success comes from fast iterations applied to rich feedback, and there is no better way to get this rich feedback than building an “ambassador community”.

4. Software is eating the world

I will conclude this post by stepping back and discussing the role of the software factory in the larger software ecosystem that companies need to be part of, since “software is eating the world”. A first logical consequence of Marc Andreessen’s observation is that software is everywhere in companies, with a much larger footprint than our “traditional information systems”. I use the word “software” because “digital” is ambiguous. In its broad sense, everything that uses bits, i.e. digital information, is part of the “digital world”, hence software is part of it. In a narrower sense, which is used by many companies, “digital” is what matters to customers: the impact of software, computers, bits … in their daily lives. Many companies separate digital and IT because they use this narrower definition of digital (in the broader sense, smart factories, IoT, computer-mediated communication, information systems, web, mobile and cloud services, etc. are all part of the digital scope). To avoid that confusion, I use the word “software” as the common root for customer digital, information systems, Internet of Things, smart control of machines, and digital communication. This helps to understand that no software factory (in the digital, IS or other organizations) is an island. It is part of multiple ecosystems, internally within the company and externally with other partners and stakeholders. In a world which is dominated by platforms, a software factory is not only a process that produces code, it is also the host of a software environment (usually centered around a platform) and an ecosystem player through APIs (Application Programming Interfaces). This also means that software factories are de facto partners with the company information systems, while at the same time it is clear that the footprint of software in companies is growing faster than their information systems. To reuse an old term from the 2000s, “shadow IT” will grow and not shrink in the future.

There is indeed a common software ecosystem – of data models, APIs, architecture patterns – for each company that requires careful thinking and management, which comes from a global viewpoint. Platform engineering demands a common data exchange model (not necessarily a unique data model) as well as common engineering practices (know-how & culture) for APIs. In other words, “software is eating the world” but it will eat the world of your own business better and faster if you care to manage this emergent process. This is a great opportunity for information systems (IS) organizations to play a “backbone role” for the various software ecosystems. Many of these software ecosystem issues are technical (software hosting and security constraints) or architectural (event-driven architecture, distributed data architecture), and they require skills and experience that are part of the information systems DNA. On the other hand, the loose coupling of platforms that are produced by autonomous teams may be a “new art” for the more traditional IS organizations. In the digital world, the platform is the team and the team is the platform: the platform is a live object that evolves continuously to adapt to its environment. A software platform is not something that you buy, not even something that you build, but something that you grow.

I will conclude with a simple but powerful idea: Digital Experience Factories are technology accelerators, i.e. open ports for companies to leverage the continuous flow of exponential innovation. What I mean is that Digital Factories are part of exponential information systems: they form the edge (the border) of the information system that is in contact with consumers, business partners, and innovative players. In a fast-evolving world, most companies are looking for ways to become “future-proof”. The architecture of exponential information systems draws from biology, with a core that evolves slower while the frontier (membrane) evolves at a faster rate from the contact with the outside environment. Digital Factories are part of this “Fast IT” with a clear opportunity to leverage the flow of new technologies such as Artificial Intelligence, Machine Learning or Natural Language Processing. As I explained in a previous post, there are four requirements to harness these new software techniques:
  • Access to data (intelligent software drinks huge amounts of multi-sourced data)
  • Use of modern software stacks (i.e., leverage the latest open-source libraries & APIs)
  • Autonomous cross-functional teams
  • Lab culture (fact-based decisions; iterations and failure are welcome)

One can recognize in this list the foundations of the Digital Experience Factory as explained in the second section :)

Sunday, March 5, 2017

Regulation of Emergence and Ethics of Algorithms

1. Introduction

Algorithm governance is a key topic, which is receiving more and more attention as we enter this 21st century. The rise of this complex and difficult topic is no surprise, since “software is eating the world” – i.e., the part of our lives that is impacted by algorithms is constantly growing – and since software is “getting smarter” every year, with the intensification of techniques such as Machine Learning or Artificial Intelligence. The governance question is also made more acute since smarter algorithms are achieved through more emergence, serendipity and weakening of control, following the legendary insight of Kevin Kelly in his 1995 best seller “Out of Control”: “Investing machines with the ability to adapt on their own, to evolve in their own directions, and grow without human oversight is the next great advance in technology. Giving machines freedom is the only way we can have intelligent control.” Last, the algorithmic governance issue has become a public policy topic since Tim O’Reilly coined the term “Algorithmic Regulation” to designate the use of algorithms for making decisions in public policy matters.

Algorithm governance is a complex topic that may be addressed from multiple angles. Today I will start from the report written by Ilarion Pavel and Jacques Serris, “Modalities for regulating content management algorithms”. This report was written at the request of Axelle Lemaire and focuses mostly on web advertising and recommendation algorithms. Content management – i.e. deciding dynamically which content to display in front of a web visitor – is one of the most automated and optimized domains of the internet. Consequently, web search and content recommendation are domains where big data, machine learning and “smart algorithms” have been deployed at scale. Although the report is focused on content management algorithms, it takes a broad view of the topic and includes a fair amount of educational material about algorithms and machine learning. Thus, this report addresses a large number of algorithm governance issues. It includes five recommendations about algorithm regulation intended for public governance stakeholders, with the common intent of more transparency and control for algorithms that are developed in the private sector.

This short blog post is organized as follows. The first part provides a very simplified summary of the key recommendations and the main contribution of this report. I will focus on a few major ideas which I found quite interesting and thought-provoking. This report addresses some of the concerns that arise from the use of machine learning and artificial intelligence in mass-market services. The second part is a reply from the angle of our NATF work group on Big Data. As was previously explained, I find that we have entered a “new world” for algorithms that could be described as “data is the new code”. This casts a different light on some of the recommendations from the Ilarion Pavel & Jacques Serris report. As algorithms become grown from data sets through training protocols, it becomes more realistic to audit the process than the result. The last part of this post talks about the governance of emergence, or how to escape what could be seen as an oxymoron. The question could be stated as “is there a way to control and regulate something that we do not fully understand?”. As a citizen, one expects a positive answer. Other sciences learned to cope with this question a long time ago, since only computer scientists from Silicon Valley believe that we may control and fully understand life today (these issues arise constantly in the worlds of medicine, protein design or cellular biology for instance). But the existence of this positive answer for Artificial Intelligence is a topic for debate, as illustrated by Nick Bostrom’s book “Superintelligence – Paths, Dangers, Strategies”. To dive deeper into this topic, I strongly recommend reading "Code-Dependent: Pros and Cons of the Algorithm Age" by Lee Rainie and Janna Anderson.

2. Algorithm Regulation

First, I should start with my usual caveat that you should read the report rather than this very simplified and partial summary. The five recommendations can be summarized as follows:

  • Design a software platform to facilitate the study, the evaluation, and the testing of content / recommendation algorithms in a private/public collaboration open to research scientists
  • Create an algorithm audit capability for public government
  • Mandate private companies to communicate about algorithm behavior to their customers, through a “chief algorithm officer role”
  • Start a domain-specific consultation process with private/public stakeholders to formalize what these “smart content management services” are and which best practices should be promoted nationally or internationally.
  • Better train public servants who use algorithms to deliver their services to citizens

A fair amount of the report talks about Machine Learning and Artificial Intelligence, and the new questions that these techniques raise from an algorithm ethics point of view. The question “how does one know what the algorithm is doing” is getting harder to answer than in the past. On page 16, the concept of “loyalty” (is the algorithm true to its stated purpose?) is introduced and leads to an interesting debate (cf. the classical debate about the filter bubble). The authors argue – rightfully – that with the current AI & ML techniques the intent is still easy to state and to audit (for instance because we are still mostly in the era of supervised learning), but it is also clear that this may change in the future. A key idea that is briefly evoked on page 19 is that machine learning algorithms should be evaluated as a process, not on their results. Failure to do so is what triggered the drama of the Microsoft chatbot, which was made non-loyal (not to say racist and fascist) through a set of unforeseen but perfectly predictable interactions. One could say there is the equivalent of Ashby’s law of requisite variety, in the sense that the testing protocol should exhibit a complexity commensurate with the desired outcome of the algorithm. Designing training protocols and data sets for algorithms that are built from ML, so as to guarantee the robustness of their loyalty, is indeed a complex research topic that justifies the first recommendation.

We hear a lot of conflicting opinions about the threat of missing the train of AI development in Europe or in France, compared to the US or China. The topic is amplified by the huge amount of hype around AI and the enormous investments made in the last few years, while at the same time there seems to be a “race to open source” among the most prominent players. The authors propose three scenarios of AI development. In the first scenario, the current trend of sharing dominates and produces “algorithms as a commodity”. AI becomes a common and unified technology, such as compilers. Everyone uses them, but differentiation occurs elsewhere. The second scenario is the opposite: a few dominant players master the smart systems (data and algorithms) at a skill and scale level that produces a unique advantage. The third scenario focuses on data ecosystems but recognizes that the richness and regulatory complexity of data collection make it more likely to see a large number of “data silos” emerge (a larger number of locally dominant players, where the value is derived more from the data than from the AI & ML technology itself). As will become clear in the rest of this post, I see the future as the combination of scenarios 2 and 3: massive concentration for a few topics (cf. Google and Facebook) that coexists with a variety of data ecosystems (if software is eating the world and tomorrow’s software is derived from data, this is too much to chew for a single player, even with Google’s span).

A key principle proposed by the authors is to “embody” the algorithm intent through the role of “chief algorithm officer”, with the implicit idea that (a) algorithms have no will or intent of their own and there is always a human behind the code, and (b) companies should have someone who understands what the algorithm does and is able to explain it to stakeholders, from customers to regulators. The report makes a convincing case that “writing code that works is not enough”: the “chief algorithm officer” should be able to talk about it (say what it does) and prove that it works (does what is intended). There is no proof, on the other hand, that this is feasible, which is why the topic of algorithm ethics is so interesting. The authors recognize on page 36 that auditing algorithms to “understand how they work” is not scalable. It requires too much effort, will prove to be harder and harder as techniques evolve, and we might expect some undecidability theorems to hit along the way. What is required is a relaxed (weaker) mandate for algorithm regulation and auditing: to be able to audit the intent, the principles that guarantee that the intent is not lost, and the quality of the testing process. This is already a formidable challenge.

3. Data is the New Code

This tagline means that the old separation between data and code is blurring away. The code is no longer written separately, following the great thinking of the chief algorithm officer, and then applied to data. The code is the result of a process – a combination of machine learning and human learning – that is fed by the available data. “Data is the new code” was introduced in our NATF report to represent the fact that when Google values software assets for acquisition, it is the quantity and quality of collected data that gives the basis for valuation. The code may be seen as the by-product of the data and the training process. There is a lot of value and practical expertise in this training process, which is why I do not subscribe to the previously mentioned scenario of “AI as a commodity”. Building smart systems is first and foremost an engineering skill.

A first consequence is that the separation of the Chief Data Officer from the Chief Algorithm Officer is questionable. The code that implements algorithms is no longer static, it is the result of an adaptive process. Data and algorithms live in the same world, with the same team. It is hard to evaluate / audit / understand / assess the ethical behavior of data collection or algorithms if the auditor separates one from the other. Data collection needs to be evaluated with respect to the intent and the processes that are run (which has always been the position of the CNIL) and algorithms are – more and more, this is a gradual shift – the byproduct of the data that is collected.

Data ethics is also very closely related to algorithm ethics. On page 29, the report states that bias in data collection produces bias in the algorithm’s output. This is true, and the more complex the inference from data, the more complex tracking these biases may be. The questions about the ethics of data collection, and the quality and fidelity of the data samples, are bound to become increasingly prevalent. As explained before, this is not a case where one can separate the data collection from the usage. To understand fairness (the absence of biases), the complete system must be tested. Serge Abiteboul mentioned in one of his lectures the case of Staples, whose pricing mechanism, through a smart adaptive algorithm, was found to be unfair to poorer neighborhoods (because the algorithm “discovered” that you could charge higher prices when there are fewer competitors around). I recommend reading the article “Discovering Unwarranted Associations in Data-Driven Applications with the FairTest Testing Toolkit” to see what a testing protocol / platform for algorithm fairness could look like (in the spirit of the first recommendation of the report). The concept of purpose is not enough to guarantee an ethical treatment of data, since many experiments show that big data mining techniques are able to “find private pieces of data from public ones”, that is, to evaluate features that were not supposed to be collected (no opt-in, regulated topics) from data that were either “harmless” or properly collected with an opt-in. Although the true efficiency of the algorithms of “Cambridge Analytica” is still under debate, this is precisely the method that they propose to derive meaningful data traits from those that can be collected publicly.
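
To make the idea of testing the complete system more concrete, here is a minimal Python sketch of an output-level fairness check, in the spirit of the Staples example. It does not use the FairTest toolkit or reproduce its method; the function name, the field names ("income_band", "price"), the sample quotes and the 5% tolerance are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import mean


def price_gap_by_group(records, group_key="income_band", price_key="price"):
    """Return the mean price per group and the relative gap between extremes.

    The field names are illustrative assumptions, not taken from any real system.
    """
    by_group = defaultdict(list)
    for record in records:
        by_group[record[group_key]].append(record[price_key])
    means = {group: mean(prices) for group, prices in by_group.items()}
    lowest, highest = min(means.values()), max(means.values())
    gap = (highest - lowest) / lowest
    return means, gap


# Hypothetical sample: prices quoted by an adaptive pricing algorithm
# to visitors from neighborhoods in two income bands.
quotes = [
    {"income_band": "low", "price": 11.0},
    {"income_band": "low", "price": 11.0},
    {"income_band": "high", "price": 10.0},
    {"income_band": "high", "price": 10.0},
]

means, gap = price_gap_by_group(quotes)
TOLERANCE = 0.05  # arbitrary threshold chosen for this sketch
flagged = gap > TOLERANCE
print(means, round(gap, 3), "flagged" if flagged else "ok")
```

The point of the sketch is that the check only looks at inputs (the neighborhood attribute) and outputs (the quoted price), never at the inside of the pricing algorithm. A real protocol would also need statistical significance tests and controls for legitimate explanatory factors.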

The authors of the report are well aware of the rising importance of emergence in algorithm design. On page 4, they write “one grows these algorithms more than one writes them”. I could not agree more, which is why I find the fourth recommendation surprising: it sounds too much like a top-down approach where data services are drawn from analysis and committees, rather than a bottom-up approach where data services emerge from usage and collected data. In the framework of emergent algorithm design, what needs to be audited is no longer the code (the inside of the box, which is becoming more of a black box) but the factors that control emergence and the results:
  • Input data
  • Purpose (intent) of the algorithm
  •  “training” / “growing” protocol
  •  Output data
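
As an illustration of these four audit items, the following Python sketch shows one possible shape for an audit record. It is only a sketch under stated assumptions: the class name AuditRecord, its fields, the use of SHA-256 fingerprints and the sample values are all hypothetical and do not come from the report.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical record of what an auditor would keep for a grown algorithm."""
    purpose: str               # stated intent of the algorithm
    input_data_digest: str     # fingerprint of the training data
    training_protocol: dict    # how the algorithm was "grown"
    output_sample_digest: str  # fingerprint of a sample of outputs


def digest(rows):
    """Stable SHA-256 fingerprint of a list of JSON-serializable rows."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_audit_record(purpose, training_rows, protocol, output_rows):
    return AuditRecord(
        purpose=purpose,
        input_data_digest=digest(training_rows),
        training_protocol=protocol,
        output_sample_digest=digest(output_rows),
    )


# Illustrative values only.
record = build_audit_record(
    purpose="recommend articles similar to those already read",
    training_rows=[{"user": 1, "article": "a"}, {"user": 2, "article": "b"}],
    protocol={"method": "supervised", "epochs": 10, "seed": 42},
    output_rows=[{"user": 1, "recommended": "c"}],
)
print(json.dumps(asdict(record), indent=2))
```

Fingerprinting the data instead of storing it is one design choice among others: it lets an auditor verify later that the declared data set is the one that was actually used, without forcing the company to hand over the data itself.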

This brings us to our last section:  how can one control the system (delivering a “smart” experience to a customer) without controlling the “black box” (how the algorithm works) ?

4. How to Control Emergence ?

The third recommendation addresses the need to communicate about the way algorithms operate. Following the previous decomposition, I favor the recommendation on communicating about intent, with the associated capability (recommendation #2) to audit the loyalty (the algorithm does what its purpose says). On the other hand, I do not take this literally to mean explaining how the algorithm works. This was perfectly achievable in the past, but emergent algorithm design will make it more difficult. As explained earlier, there are many reasons to believe that it may simply be impossible from a scientific / decidability theory viewpoint.

This is still a slightly theoretical question as of today, but we are quickly coming to a point where we will truly no longer understand the solutions that are proposed by the algorithms. Because AlphaGo is using reinforcement learning, it has been able to synthesize strategies that may be qualified as deceiving, or hiding its intent from the opponent player. But humans are very good at understanding Go strategies. In the case of the recent win of AI in poker tournaments, it is trickier since we humans have a more difficult time understanding randomized strategies. We have known this from game theory and Nash equilibria for a long time. Pure strategies are easier to understand, but mixed strategies are often the winning ones. Some commentators assess that the domination of the machine over humans is even more impressive for Poker than for Go, which to me reflects the superiority of the machine in handling mixed (i.e. randomized) strategies. As we start mixing artificial intelligence with game theory, we will grow algorithms that are difficult to explain (i.e., we will explain the input, the output, the intent and the protocol, not what the algorithm does). If one only uses a single AI or machine learning technique, such as deep learning, it is still possible to feel “in control” of what the machine does. But when a mix of techniques is used, such as evolutionary game theory, generative AI, combinatorial optimization and Monte-Carlo simulation, it becomes much less clear. As a practitioner of GTES (Game Theoretical Evolutionary Simulation) for a decade, it is very clear to me that the next 10 years of Moore’s Law will produce “smart algorithms” with deep insights from game theory that will make them able to interact with their environment – that is, us – in uncanny ways.
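
For readers who want to see why mixed strategies are harder to read than pure ones, here is a small Python sketch that computes the mixed equilibrium of a 2x2 zero-sum game using the textbook closed-form solution. It is a toy illustration, unrelated to the poker programs or to GTES; the function name and the example payoffs are my own assumptions.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a, b, c, d):
    """Mixed equilibrium of the zero-sum game with row-player payoffs [[a, b], [c, d]].

    Returns (p, q, value): p is the probability that the row player picks row 1,
    q the probability that the column player picks column 1, and value is the
    expected payoff of the row player. Raises ValueError when a pure-strategy
    saddle point exists, since no randomization is needed in that case.
    """
    a, b, c, d = (Fraction(x) for x in (a, b, c, d))
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("the game has a pure-strategy saddle point")
    denominator = a - b - c + d
    p = (d - c) / denominator  # makes the column player indifferent
    q = (d - b) / denominator  # makes the row player indifferent
    value = (a * d - b * c) / denominator
    return p, q, value


# Matching pennies: each player must randomize 50/50 and the game is fair.
print(mixed_equilibrium_2x2(1, -1, -1, 1))

# An asymmetric variant: the optimal play is a 2/5 vs 3/5 randomization.
print(mixed_equilibrium_2x2(2, -1, -1, 1))
```

In both examples, no single move "explains" the strategy: the only faithful description of what the player does is the probability distribution itself, which is exactly what makes such behavior harder for a human observer to interpret.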

I have used the “black box” metaphor because a systemic approach to controlling “smart algorithms” is containment, that is, isolating them as a subsystem in a “box of constraints”. This is how we handle most other dangerous materials, from viruses to radioactive materials. This is far from easy from a software perspective, but there is no proof that it is impossible either. Containment starts with designing interfaces, to control what the algorithm has access to, and what outcomes / suggestions it may produce. The experience of complex system engineering shows that containment is not sufficient, because of the nature of the complex interactions that may appear, but it is still a mandatory foundation for safe system design. It is not sufficient for practical reasons: the level of containment that is necessary for safety is often in contradiction with the usefulness of the component. Think of a truly great “strong AI” in a battery-powered box with no network connection and a small set of buttons and lights as an interface. The danger of this “superintelligence” is contained, but it is not really useful either. The fact that safety may not come solely from containment is the reason we need complex / systemic testing protocols, as explained earlier.
Another possible direction is to “weave” properties into the code of the emergent algorithm. It is indeed possible to impose simple properties onto complex algorithms, that may be proven formally. 

The paradox is that there are simple properties of programs, such as termination, which are undecidable, while at the same time, using techniques such as abstract interpretation or model checking, we may formally prove properties about the outputs. For my more technical readers, one could imagine weaving the purpose of the algorithm, using aspect-oriented programming, into a framework that is grown through machine learning. This is the implicit assumption of the sci-fi movies about Asimov’s laws that are “coded into the robots”: they must be either “woven” into the smart brain of the robot or added as a controlling supervisor – precisely the containment approach, which is always what gets broken in the movie. The idea of being able to weave “declarative properties” – that capture the intent of the algorithm and may be audited – into a mesh of code that is grown from data analysis is a way to reconcile the ambition of the Ilarion Pavel and Jacques Serris report with the reality of emergent design. This is a new field to create and develop, in parallel with the development of AI and machine learning in software that is eating the world. This will not happen without regulation and pressure from public opinion.
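
As a very rough illustration of what “weaving” a declarative property could look like, here is a Python sketch that uses a decorator as a poor man’s aspect. It is a runtime check, closer to the “controlling supervisor” than to a formal proof, and everything in it is hypothetical: the enforce decorator, the PropertyViolation exception, the 10% price band and the stand-in learned_price function (which plays the role of a model grown from data).

```python
import functools


class PropertyViolation(Exception):
    """Raised when an output breaks a declared property."""


def enforce(postcondition, description):
    """Attach a declared, auditable property to a function.

    The postcondition receives the result followed by the original arguments.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not postcondition(result, *args, **kwargs):
                raise PropertyViolation(description)
            return result
        # The declaration stays readable by an auditor, whatever the body does.
        wrapper.declared_property = description
        return wrapper
    return decorator


@enforce(
    lambda price, base_price, signals: 0.9 * base_price <= price <= 1.1 * base_price,
    "price stays within 10% of the base price",
)
def learned_price(base_price, signals):
    # Stand-in for a pricing model grown from data (illustrative formula only).
    return base_price * (1.0 + 0.5 * signals.get("low_competition", 0.0))


print(learned_price.declared_property)
print(learned_price(100.0, {"low_competition": 0.1}))  # within the declared band

try:
    learned_price(100.0, {"low_competition": 0.5})  # would be 25% above base
except PropertyViolation as violation:
    print("blocked:", violation)
```

The design choice worth noticing is the separation of concerns: the property is stated once, outside the learned code, and can be read and audited without understanding how the price is computed.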

These are not theoretical considerations, because the need to control emergent design will arise very soon. Some of these concerns are pushed away by creating divides: “weak AI” that would be well controlled versus “strong AI” that is dangerous but still a dream, “supervised machine learning” that is by definition under control versus “unsupervised learning” which is still a laboratory research topic. The reality is very different: these are not hard boundaries, and there is a gradual shift, day after day, as we benefit from more computing power and more data to experiment with new techniques. Designing methods to control emergence requires humility (about what we do not know) and paranoia (because bad usage of emergence without control or foresight will happen).
