Friday, February 21, 2014

Complex systems, Scale-free networks and Affiliation networks



This post is a follow-up to the previous one about complex systems. I will focus further on the fifth point, which said that efficiency in a complex system is strongly related to the capability to support information exchange flows, and will expand on the importance of scale-free networks. The importance and the size of information flows are directly related to the complexity of such systems, that is, to the amount of interaction between their components.
What makes a good information flow network for a complex system? Here are five characteristics that spring to mind:
  • Latency: to minimize the time it takes for some information to travel from one subsystem to another.
  • Throughput: to simultaneously transmit large amounts of information between all subsystems.
  • Resilience: to continue functioning when some subsystems or some links become unavailable.  These three characteristics are universal for all complex systems, such as information systems or enterprise communication channels.
  • Searchability: to ease the task of finding related subsystems through the exploration of the communication network. This property is related to dynamic growth and self-organization. In an autonomic system, automatic discovery of new features and new components is at the heart of the system’s dynamic organization.
  • Cost: to minimize the total weight of the communication network, whether we talk about energy, mass or dollars. Related to cost is the scalability of the communication network structure. Scalability means that the structure may easily evolve as the complex system grows.

To reduce latency, one must reduce the diameter of the network, which is (roughly) the average path length. The easiest way is to add additional links. Similarly, to increase throughput and resilience, one must rely on path redundancy (the fact that many paths exist for routing one flow). However, there is a double trade-off: increasing the average degree and the number of edges both increases the cost and reduces searchability.

Nature seems to have found the perfect solution for this trade-off with the scale-free network structure. A scale-free network is a graph whose degree distribution follows a power law. Compared to a random graph, this means that there is a higher frequency of highly connected nodes, with large degrees. Scale-free networks have many wonderful properties, as explained by Duncan Watts or Albert-Laszlo Barabasi. Their diameter is logarithmic in their size, and they are very resilient, that is, their level of connectivity changes little when some nodes become unavailable. The name “scale-free” comes from the self-similarity that the degree distribution implies. Somehow, a scale-free network may be seen as a “fractal structure”, which makes it an interesting candidate for self-growth and self-organization.
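To make the degree distribution concrete, here is a minimal, illustrative simulation (function names and parameters are my own, not from the literature): it grows a graph by degree-proportional attachment and shows that the maximum degree dwarfs the average degree, the signature of a power-law tail.

```python
import random
from collections import Counter

def preferential_attachment(n, m=2, seed=42):
    """Grow a graph one node at a time; each new node attaches m edges,
    choosing targets with probability proportional to current degree."""
    rng = random.Random(seed)
    # Start from a small complete seed graph of m+1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    targets = [v for e in edges for v in e]  # each node repeated once per edge end
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))  # degree-proportional sampling
        for t in chosen:
            edges.append((new, t))
            targets.extend([new, t])
    return edges

edges = preferential_attachment(2000)
degrees = Counter(v for e in edges for v in e)
# Heavy tail: the maximum degree far exceeds the average degree (about 4 here).
print(max(degrees.values()), sum(degrees.values()) / len(degrees))
```

A random graph with the same number of edges would keep all degrees close to the average; here the early nodes become hubs, exactly the “connectors” discussed below.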

What has been found in the past 20 years is that scale-free networks are everywhere, both in nature-made complex systems (such as the network of chemical reactions within the brain) and in human-grown systems that incorporate feedback and learning, such as the Web (network of pages) or the Internet (network of computers). Let me quote the introduction of « Scale-Free Networks: A Decade and Beyond » by Albert-Laszlo Barabasi: “For decades, we tacitly assumed that the components of such complex systems as the cell, the society, or the Internet are randomly wired together. In the past decade, an avalanche of research has shown that many real networks, independent of their age, function, and scope, converge to similar architectures, a universality that allowed researchers from different disciplines to embrace network theory as a common paradigm.” As with any general big idea, this is an approximation of the real world, and there is some debate about whether real networks have an exact power law for their degree distribution. Still, it is both a useful and powerful concept when trying to design communication networks.



I will now write a brief summary of « Linked », a great book by Albert-Laszlo Barabasi. I read this book many years ago, and promised to give a review in my other blog, but never found the time to do it. Still, it is very relevant to what I just wrote (together with many other books which I have selected in this post) since it contains a lot of details and examples about the importance of scale-free networks. The following is a short list of relevant key ideas that are well illustrated in this book, with no claim of completeness:
  • The book starts with the concepts of diameter and average path length. Throughout the book, many examples are given of really large networks with small diameters. For instance, the diameter of the Web (URL network) is 19. Another interesting example is the molecule interaction network in a living cell, through chemical reactions: its “diameter” is only 3 (three degrees of separation). Lately we have learned that the diameter of the Facebook social graph is 4.7.
  • By looking more closely at these networks, we see that the short diameter is not due to the number of edges but to the presence of “connectors” (hubs), as defined by Malcolm Gladwell in “The Tipping Point”. This is true for cell reactions, where a few molecules interact with many others. Small-world networks, as defined by Watts and Strogatz, also exhibit a higher clustering coefficient than random graphs. These small-world structures may be thought of as small, tightly connected groups, linked by connectors – hubs with high degrees.
  • This leads to the concept of scale-free networks, by looking at the node degree distribution law. The presence of connectors is the result of power laws, which are also called “fat-tailed” because the number of nodes with very high degree is much higher than what a typical “exponential decay” law would predict. Another interesting example of scale-free networks is the graph of word co-occurrence in natural language.
  • A good part of the book deals with how scale-free networks may be grown, that is, how they emerge in real life. This leads to the powerful “rich get richer” paradigm (also called the Matthew Effect), where the probability of creating a new edge is proportional to the existing degree. Growth is a signature of scale-free networks. I quote from the book: “The power laws emerge – nature’s unmistakable sign that chaos is departing in favor of order. The theory of phase transitions told us loud and clear that the road from disorder to order is maintained by the powerful forces of self-organization and is paved by power laws”.
  • A very interesting part of the book deals with resilience, with examples drawn from biology such as the protein network in our metabolism. There is an interesting comparison with hierarchical networks (such as organizational charts in a traditional company or electricity distribution network) which are less fault-tolerant than scale-free networks (even with added redundancy for the high value links). Another quote: “The coexistence of robustness and vulnerability plays a key role in understanding the behavior of most complex systems. Simulations have shown that the protein network refuses to break apart under randomly generated mutations.”
  • Scale-free networks are graphs, with edges between two nodes that only describe binary interactions. Most real-world complex systems use more complex “n-ary” interactions, which could be described with hypergraphs, two-mode networks or affiliation networks. For instance, the meetings between coworkers in a company or the chemical reaction networks are hypergraphs. A meeting is a hyper-edge since it binds many participants; a chemical reaction is also a hyper-edge in the molecule graph. It is easy to model an affiliation network with a regular bipartite graph (just add a few nodes for the hyper-edges), so this is not a big technical difference, but more and more interest is given to affiliation networks since they are very common in the real world of complex systems.
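To make the bipartite modeling concrete, here is a toy sketch (the meeting names and participants are invented): each meeting is a hyper-edge, and projecting onto the “people” side recovers an ordinary graph where two people are linked if they attend at least one meeting together.

```python
from itertools import combinations

# Affiliation (two-mode) network: hyper-edges are meetings, nodes are people.
meetings = {
    "standup_team_a": {"alice", "bob", "carol"},
    "standup_team_b": {"dan", "erin", "frank"},
    "town_hall": {"alice", "bob", "dan", "erin", "grace"},
}

# One-mode projection: connect two people if they share at least one meeting.
projection = set()
for participants in meetings.values():
    for pair in combinations(sorted(participants), 2):
        projection.add(pair)

print(len(projection))  # distinct person-to-person ties
```

The projection loses information (a five-person town hall and ten separate one-on-ones produce the same edges), which is precisely why affiliation networks deserve to be studied in their own right.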

Five years ago I decided to see if Duncan Watts’s results would also apply to Affiliation Networks. I wrote a paper entitled “Efficiency of Meetings as a Communication Channel : A Social Network Analysis” which I presented at the “Management and Social Networks” conference in Geneva (2012). The main findings may be described as follows:
  • I have shown that the most efficient meeting network structure relies on small meetings that have a high frequency. There is no surprise here, since this is a tenet of agile companies which are organized around daily short team meetings. Still, it is interesting to see that this is a deep structural property of the underlying network.
  • I have proposed a “latency performance indicator” that predicts the speed of information propagation as “#of-monthly-meetings * log(#people-that-one-wants-to-communicate-with) / log(#people-that-one-actually-meets-in-a-month)”. For those mathematically inclined, one may retrieve the best practices (small meetings, frequent meetings, plus a few large meetings) within the formula.
  • The most interesting piece is the emergence of a small-world structure as the most efficient meeting network, which is a hybrid combination of small team meetings and a few larger meetings. This reproduces, in the case of an affiliation network, the results found ten years ago by Duncan Watts. It suggests that companies should reproduce the diversity found in nature, implement path redundancy, and combine many really small and frequent meetings, such as SCRUM stand-up meetings, together with a few overlapping “town-hall” meetings (large audiences).
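For readers who want to play with the indicator, here is one possible transcription (my own reading of the formula; the function name, variable names and example figures are invented): the ratio log(N)/log(k) counts the propagation hops in a contact tree with branching factor k, and the indicator scales this hop count by the monthly meeting frequency.

```python
from math import log

def latency_indicator(monthly_meetings, people_to_reach, people_met_monthly):
    # log(N) / log(k) is the number of propagation "hops" when each person
    # relays information to the k people they meet during a month.
    hops = log(people_to_reach) / log(people_met_monthly)
    return monthly_meetings * hops

# Stylized comparison for reaching 200 coworkers (numbers are invented):
print(latency_indicator(20, 200, 8))   # frequent small meetings
print(latency_indicator(1, 200, 60))   # one monthly town-hall
```

Plugging in different meeting policies shows how frequency and audience size trade off against each other, which is the structural point made above.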

Scale-free networks are similar to what sociology calls “ambidextrous organizations”. Ambidextrous organizations leverage the power of cliques and the strength of weak ties. The “power of cliques” is precisely the strength of team work: small groups of people that are all connected to one another (hence the clique name), establishing “strong ties” (which means frequent ties, in the world of social network science). The “strength of weak ties” is the law established by Mark Granovetter, which says that we need to use our “weak ties”/extended network to get out of difficult or exceptional situations. “Weak ties” refer to people that we see rarely (as opposed to strong ties); the “weak ties” form the edge of our social network, and they provide the diversity of viewpoint and culture which is often absent from the core of our social network (since “strong ties” tend to be very similar to ourselves).
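Granovetter’s point can be illustrated in a few lines of code (a deliberately tiny toy graph): two cliques of strong ties are mutually unreachable until a single weak tie bridges them.

```python
from collections import deque

def shortest_path(adj, src, dst):
    """Breadth-first search distance between two nodes (inf if unreachable)."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return d
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, d + 1))
    return float("inf")

def make_adj(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

# Two tightly knit cliques ("strong ties") ...
clique_a = [(a, b) for a in "ABC" for b in "ABC" if a < b]
clique_b = [(a, b) for a in "XYZ" for b in "XYZ" if a < b]
# ... with and without a single bridging "weak tie".
without_bridge = make_adj(clique_a + clique_b)
with_bridge = make_adj(clique_a + clique_b + [("C", "X")])

print(shortest_path(without_bridge, "A", "Z"))  # unreachable
print(shortest_path(with_bridge, "A", "Z"))     # 3 hops via the weak tie
```

One extra edge does more for reachability than any number of additional intra-clique edges, which is why the weak ties carry the novel information.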


The idea that complex systems science in general, and social network structures in particular, are relevant to enterprise organization is becoming more and more popular (this is precisely the topic of my other blog). I will conclude with three examples which are closely related to this post, since those three theories attempt to improve management efficiency through a better-designed information network:

  • Sociocracy uses circles as a team structure, and double links (each intersection between circles is represented by two individuals) to implement redundant information propagation paths. The illustration is taken from Wikipedia.
  • BetaCodex is a management theory and practice whose claim is to “organize for complexity”. It is based on a cellular network structure, which draws its organizing principles from biology. The tree structure is replaced by a denser network of circles (with a clear reference to sociocracy), providing shorter and more resilient information propagation paths.
  • Holacracy is another recent management theory that draws on complex system theory. Here again we find a system of self-organizing circles (with a similar influence from sociocracy). The most defining feature of holacracy is “to organize around purpose” (cf. the fourth principle of our previous list).



Tuesday, December 31, 2013

Seven Keys for Complex Systems Engineering



I gave a talk early this year at the “IRT SystemX” inauguration, about the challenges that occur when engineering “Systems of Systems”. This talk is a quick introduction to what we can learn from complex systems when designing large-scale interactive industrial systems. Complex systems are defined by their goals (purpose) and a set of sub-systems with rich interactions. The complexity of these interactions yields the concept of emergent behavior. Complex systems have a fractal nature, that is, they exhibit multiple scales, both at a physical/descriptive level and at a temporal level. Complex systems embed memory and have the capability to learn, which makes them both dynamic and adaptive systems. They interact constantly with their environment, which means that a dynamic vision of flows is more relevant than a static description of their top-down decomposition. Most complex systems renew their low-level components in a continuous process. Teleonomy and process analysis are, therefore, the most useful approaches to capture the essence of a complex system.

I have become gradually fascinated by the topic of complex systems because I find it everywhere in my job and my own research. Complex systems theory is the right framework to understand the management and organization of modern enterprises. This is the topic of my other blog. All that is said about complex systems in the previous paragraph applies to a company. I also found that this applies to information systems as well. The main reason for creating this blog was the realization that the proper control for information systems has to be emergent, following the lead of Kevin Kelly and the intuition behind Autonomic Computing. Last, complex systems are everywhere when one tries to understand the most common business ecosystems, such as smartphone application development, smart homes or smart grids. I have talked about Smart Grids Players as a Complex System in this blog. More examples may be found in my keynote at CSDM 2012.

There is a paradox with the popularity of “complex systems science” in today’s business culture. On the one hand, the importance of complex systems’ concepts is obvious everywhere: systems of systems, enterprises, markets. On the other hand, the practical insights are not so clear. « System thinking » has become a buzzword and the word “complexity” is everywhere … yet many textbooks and articles which claim to apply “the latest of complex science theory” to business and management problems are either obscure or shallow. This is not to say that the complex systems literature lacks a wealth of knowledge and practical insight. On the contrary, the following is a selection of some of the books which I have found useful during the last few years.



Today’s post is a crude and preliminary attempt to pick seven keys that I have found in these books which, to me at least, are practical in the sense that they unlock some of the complexity – or mystery – of the practical complex systems which I have encountered. There is no claim of completeness or rigorous selection. This is clearly a personal and subjective list which I consider a « work in progress ». This is just a list, so I will not develop each of the seven keys here, although each would deserve a blog post of its own.

  1. Complexity means that forecasting is at best extremely slippery and difficult, and most often outright impossible. This is, for instance, the key lesson from Nassim Taleb’s books, such as The Black Swan. The non-linearity of complex system interactions causes the famed butterfly effect, in all kinds of disciplines. If you line up a series of queues, such as in the Beer Game supply chain example, each queue amplifies the variations produced by the previous one and the result is very hard to forecast, hence to control (this depends, obviously, on the system load). This does not mean that simulation of complex systems is useless; it means that it must be used for training as opposed to forecasting. Following Sun Tzu or François Jullien, one must practice “serious games” (such as war games) to learn about complex systems from experience. This complexity also means that one needs as much data as possible to understand what is happening, and should beware of simplified/abstract descriptions. “God is in the detail” has become a very popular business idiom in the last decades.

  2. Complex systems most often live in a complex environment, which makes homeostasis an (increasingly) complex feat of change management. Homeostasis describes the process through which a complex system continuously adapts to its changing environment. The characteristic of successful complex systems, in a business context, is the ability to react quickly, with a large range of possible reactions. This applies both at the level of what the system does and what it is capable of doing. This is illustrated by the rise of the word “agility” in the business vocabulary. The law of requisite variety tells us why detailed perception is crucial for a complex system (which is clearly exemplified by recent robots): the system’s representation of the environment should be as detailed/varied as the sub-space of the outside environment that the homeostasis process needs to react to.

  3. Complex systems, because of their non-linear interactions in general, and because their components have both memory and the capability to learn, exhibit statistical behaviors which are quite different from “classical” (Gaussian) distributions. This is one of the most fascinating insights from complex systems theory: fat tails (power laws) are the signature of intelligent behavior (such as learning). In classical physics or statistics, all individual events are (most often) assumed to be independent, which yields the law of large numbers and Gaussian distributions. But when the individual events are caused by actors who can learn or influence each other, this is no longer true. Rather than the obvious reference to Nassim Taleb, the best book I have read on this is The Physics of Wall Street. This works both ways: it warns us that “black swans” should be expected from complex systems, but also tells us that some form of coordinated behavior is probably at work when we observe a fat tail. There is another interesting consequence: small may be beautiful with complex systems, if adding many similar sub-systems creates unforeseen complexity! Classical statistics is all in favor of large scale and centralization (reduction of variability) whereas complex behavior may be better understood with a de-centralized approach. This is precisely one of the most interesting debates about smart grids: if there is no feedback, learning or user behavior change, the linear nature of electricity consumption favors centralization (and large networks); if the opposite is true, a system-of-systems approach may be the best one.

  4. Resilience in complex systems often comes from the distribution of the whole system purpose to each of its subcomponents. This is another great insight from complex system theory: control needs to be not only distributed (to sub-systems) but also declarative, that is, the system’s purpose is distributed and the control (deriving the action from the purpose) is done “locally” (at the sub-system level). This idea of embedding the whole system’s purpose into each component is often referred to as the holographic principle, with a nice hologram metaphor (in each piece of a hologram, there is a “picture” of the whole object). This principle has been proven many times experimentally with information systems’ design: it has produced “policy-based control”, where the goals/SLA/purposes are distributed in a declarative form (hence the word “policy”) to all sub-components. I gave the example of SlapOS in my IRT talk as a great illustration of this principle. This is also closely related to the need for fast reaction in the homeostasis process: agility requires distribution of control, with a bottom-up / networked organization similar to living organisms (for most critical functions). One of my favorite books applying this to the world of enterprise organization is “Managing the Evolving Corporation” by Langdon Morris.

  5. Efficiency in a complex system is strongly related to the capability to support information exchange flows. There is a wealth of information about the structure of information networks that best support these flows. Scale-free networks, for instance, occur in many complex systems, ranging from the Web to the molecular interactions in living cells and including social networks. Scale-free networks reduce the average diameter, among other interesting properties, and can be linked to avoiding long paths in communication chains, both for agility and resilience. The challenge that these information flows produce is represented by the product of the interaction richness (the essence of complexity in a complex system) and the high frequency of these interactions (our key #2) – the product of two large numbers being an even larger number. My other blog is dedicated to the idea that managing the information flows is the most critical management challenge for the 21st century (an idea borrowed from “Organizations” by March & Simon). For instance, the necessity to avoid long paths translates into versatility: complexity prevents specialization, because too much specialization generates even more synchronization flows. This communication challenge is not simply about capabilities (“the size of the communication pipes”), it is also about semantics and meaning. A common vocabulary is essential to most “systems of systems”, whether they are industrial systems or companies.

  6. Complexity in time is something that is difficult for humans to appreciate. One of the most critical aspects of complex systems are the loops, mostly feedback loops. Peter Senge and John Sterman have written famous books about this. Reinforcement and stabilizing loops are what matter the most when trying to describe a complex system, precisely because of their non-linear natures. The combination of loops, memory and delays causes surprises to human observers. John Sterman gives many examples of overshooting, which happens when humans over-react because of the delay. Kevin Kelly gives similar examples related to the management of wildlife ecosystems. The lesson from nature is a lesson of humility: we are not good at understanding delays and their systemic effects in a loop. In the world of business, we have difficulty understanding the long-term consequences of our actions, or simply visualizing long-term equilibriums. Many people think that user market share and sales market share should converge, given enough years, without seeing the bigger picture and the influence of attrition rate (churn). Even simple laws such as Little’s Law may produce counter-intuitive behaviors.
  7. Efficient control for complex systems is an emergent property. Control strategies must be grown and learned, in a bottom-up approach as opposed to a top-down design. We are back to autonomic computing: top-down or centralized control does not work. It may be seen as another consequence of Ross Ashby’s law of requisite variety: complete control is simply impossible. Adaptive control requires autonomy and learning. This is, in my opinion, the key insight from Kevin Kelly’s book, Out of Control: « Investing machines with the ability to adapt on their own, to evolve in their own directions, and grow without human oversight is the next great advance in technology. Giving machines freedom is the only way we can have intelligent control ». This insight is closely related to our key #4: autonomy and learning progressively transform distributed policies into emergent control. There exists another corollary of this principle: such policies, or rules, should be simple, and the more complex the system, the simpler the rules. One could say that this is nothing more than the old idiom KISS, a battlefield lesson from engineering lore. But there is more to it; there seems to be a systemic law, supported by business experience: only simple explicit rules provide long-term value to complex systems. Any rule that is complex has to be implicit, that is, constantly challenged and re-learned.
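Little’s Law (L = λW: the average number of items in a system equals the arrival rate times the average time each item spends inside) is a good example of a simple law whose consequences surprise us. The toy deterministic queue below (all numbers are invented for illustration) verifies it by brute force:

```python
def average_in_system(arrival_interval, time_in_system, horizon):
    """Average occupancy of a deterministic queue over [0, horizon)."""
    arrivals = range(0, horizon, arrival_interval)
    occupancy = []
    for t in range(horizon):
        # A customer arriving at time a is present while a <= t < a + W.
        occupancy.append(sum(1 for a in arrivals if a <= t < a + time_in_system))
    return sum(occupancy) / len(occupancy)

lam = 1 / 2   # one arrival every 2 minutes
W = 10        # each customer spends 10 minutes in the system
L = average_in_system(2, 10, 2000)
print(L, lam * W)  # Little's Law: L converges to lambda * W
```

Halving the time spent in the system halves the number of items in flight, at the same arrival rate: a systemic effect that is easy to state and routinely missed when reasoning about churn, inventories or work in progress.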


Sunday, October 20, 2013

Lean Startup & Lean Innovation Factory


I had the privilege to attend the Lean IT Summit in Paris a week ago, and was pleased to hear “The Lean Startup” mentioned in almost half of the talks. Actually, the Lean Startup is so popular that some are getting annoyed :) I co-wrote the preface of the French edition because I am a strong believer in the principles that Eric Ries explains in his book. However, with popularity comes exaggeration and re-interpretation. Here are two things I heard during the Lean IT Summit that got me annoyed as well:
  • The Lean Startup is what the lean community has expressed for a long time, with better words. Kudos to Eric Ries for being such a great communicator !
  • The Lean Startup is a lean reformulation of well-known innovation practices. Actually, innovation is in the genes of lean manufacturing, so no surprise there !
I disagree on both accounts:
  • The Lean Startup is not a book about lean, it’s a book about innovation – mostly startups, but it is also relevant for larger companies, which is why I am such a strong advocate. After writing part of the preface, I ordered many dozens of copies of the book, which I have distributed freely in my own company. Sure, the lean framework gives a lot of sense to the overall contribution, but this is not the point.
  • Although many of the key ideas have been around for a while, the combination of these principles into a well-defined innovation process is a true contribution. It definitely goes against what most people believed to be innovation in larger companies. I had heard Eric Ries’s ideas expressed by a few VCs from Silicon Valley, but they were anything but mainstream.
Hence this short post is about two things. The first part is a “Lean Startup for dummies” summary. It is by no means thorough or complete – my French post from two years ago did a better job – but it is written for the corporate world and emphasizes what may be seen as “different”, at least compared with how “innovation” was described ten years ago, when we talked about “ideation factories”. The second part describes what I call the “Lean Innovation Factory”, that is, the application of Lean Startup principles to the innovation division of a large company.

1. Lean Startup for dummies

Eric Ries’s book deserves to be read because it is filled with meaningful examples. Therefore, a short summary cannot do justice to its content. Here I will only pick three key principles:

(a)    Innovation is about doing, not about producing ideas

This principle is very similar to what the pretotyping manifesto promotes. The pretotyping manifesto gave us these mottos: innovators beat ideas, pretotypes beat productypes, building beats talking… which all tell us that the key part of innovation is the doing. This is especially true in the digital world, and is acknowledged by similar mottos from Google (“Focus on the user and all else will follow”, “Fast is better than slow”) or Facebook (“code wins”, ”done is better than perfect”). To innovate means, most of the time and above everything else, to meet a customer problem and to remove a pain point. Value creation occurs at the contact with the customer, not in a brainstorming room. This does not mean that ideation tools and techniques are not useful or important; it means that only “on the gemba” can we check that an innovation actually works.

This is more revolutionary than it may sound for larger and older companies, which have associated the “innovation” word with “great ideas”. I have in my library dozens of books about innovation that distinguish between all kinds of innovation (according to the source of the “newness”) and that propose many processes for reaching all kinds of customers. The beauty of the lean startup framework is to simplify – so to speak, since value-creation-at-the-hands-of-the-customer is indeed hard – and to get rid of all the innovation funnels and ideation laboratory paraphernalia. What is clear to me after 15 years in the world of telecommunication service innovation is that everyone has the same ideas; the difference between success, failure and doing nothing (the most frequent case) is the quality of the execution process.

(b)   Innovation requires iteration since nobody gets it right the first time.

This principle is often associated with the motto: fail fast to succeed sooner. In the Lean Startup world, it leads to the MVP: minimum viable product. Each word is important: an MVP is a product that may be placed in our customers’ hands (this is not a prototype; it may be simple but it should not be fragile). An MVP is “viable” when it solves the customer’s problem. Its role is to jumpstart an iterative process of feedback collection, which may only happen if the customer finds a practical interest in the MVP, on the first day. An MVP is “minimal” because it is “as simple as possible but not simpler”, to paraphrase Einstein. This allows us to start the iteration as soon as possible, but not sooner. This emphasis on iteration echoes what a venture capitalist from Silicon Valley told me six years ago: there is no correlation between the success of a software startup and the quality of the piece of code that is shown to the early investors. On the other hand, there is a clear correlation between success and the ability to listen to the feedback of early customers and turn it into improvements.

This is also a bigger difference than one may think with the prevailing culture of large companies. It goes against the myth “you must get it right the first time; you have only one chance to make the right impression”. The common culture of detailed market studies, coupled with the practice of lengthy marketing requirements, is replaced by a “hands-on” culture. The MVP is a process that co-constructs software code, requirements and detailed specifications at the same time.

(c) A successful business model is built iteratively using customers’ feedback.

A successful business model is not a pre-condition but a post-condition for the innovation process. A startup is a “business model factory”; this is well understood today by the various startup “incubators” and “accelerators”, and it may be acknowledged as one of Eric Ries’ contributions. To make a “business model factory” deliver, one needs three things. First, we need to set up measurement points in our MVP. We need to measure usage and value creation, that is, how the problem is being solved. Second, we need to build and then validate a value creation model, which Eric Ries calls innovation accounting. This is the direct application of the old saying “a measure is worth nothing without a model” (without a model, one does not know how to interpret a measure). This is an iterative process and not an exact science, where trial and error is the common approach. One formulates hypotheses, which are either validated or invalidated by the collected measurements. Eric Ries is adamant in his book about preferring facts to opinions :). Last, when the model fails, the startup needs to “pivot”, that is, to formulate a new value creation hypothesis. A key contribution from The Lean Startup is the wealth of examples and explanations regarding business models and pivoting.

This third principle is no less of a rupture with respect to the sanctity of the business case and its return on investment (RoI) that is observed in many large companies. It is simply not possible to formulate a credible business case when one starts to innovate. Obviously, one needs to start somewhere, hence there must be some initial hypotheses regarding value creation. However, the business model for the MVP is the result of an iterative process; the good news is that it comes with the validation provided by usage measures.


2. Lean Innovation Factory


I have started to use the term « Lean Innovation Factory » as a way to encapsulate principles from The Lean Startup applied to the innovation division of a large company, such as the one that I manage at Bouygues Telecom.  The name Lean Innovation Factory (LIF) captures three ambitions:

(1)    It is an innovation factory.

A “lean innovation factory” is a process that produces innovations. An innovation is a product or service that solves a problem, which is demonstrated in the hands of a customer. The process does not need to deliver a full-scale solution to prove its effectiveness; it can operate on a smaller set of customers, but only the “monitored feedback” of real users will validate the creation of innovative value. The emphasis is on “doing” and “building”; ideas have no glorified status in the Lean Innovation Factory, we strive for physical products and running software. We make ours the words of W. Edwards Deming: “In God we trust, all others must bring data”.

(2)   It follows the “Lean Startup” principles.

The engine for creating value is the iteration of MVP feedback, which means that we strive to build the first MVP as quickly as possible (fail fast to succeed sooner), while keeping the meaning of “viable” in mind: the MVP is not a prototype, it is a product. We implement the heart of innovation accounting, in the sense that we measure feedback and continuously build a value creation model that is validated or invalidated by our users.

(3)    As a “factory”, the process is as important as the end result, because the result keeps changing while the strengths and skills of the “factory workers” build up.

This is the same pitch that I made for the “lean software factory”, and a reason for choosing a similar name :) To build a lean innovation factory is not only to build great product or service innovations, it means to build an organization that learns to do this better and better over time. This is clearly what Eric Ries tries to teach from his own experience with many startups, and where the link with "The Toyota Way" is the most evident.






Sunday, July 14, 2013

Follow-up on Lean Architecture

This post is a sequel to the previous post regarding the lean architect. Its main topic is a book review of “Lean Architecture for Agile Software Development” by James O. Coplien & Gertrud Bjørnvig. This is really a follow-up in the sense that I have found most of the ideas from my previous post expressed in this book, but more thoughtfully presented :).  It also goes further than my previous analysis, which is why I am writing this quick – and incomplete – book review. James Coplien is both a prolific author and a serious expert on software architecture, with an itinerary that is not so different from mine, especially with respect to object-oriented programming, which used to be my own domain of expertise twenty years ago. Similarly, the ten years I spent working on information systems architecture, from 1997 to 2007, are well mirrored by Coplien’s track record in systems architecture in the 90s and 00s.

The first key idea of “Lean Architecture” is the reconciliation of agile and architecture, driven by the increase in scale of the projects that agile methods address today. “Extreme Programming (XP) started out in part by consciously trying to do exactly the opposite of what conventional wisdom recommended, and in part by limiting itself to small-scale software development. Over time, we have come full circle, and many of the old practices are being restored, even in the halls and canon of Agiledom”.  On page 161, one finds a nice graph from Boehm and Turner that shows the effect of size on the need for architecture.  As usual with Boehm, this is a data-derived graph that captures one’s intuition: once a software project becomes large, anticipation and forethought are required.
There is no surprise here. On page 15 we read “Ignoring architecture in the long term increases long term costs”, which anyone with gray hair knows firsthand. The insight from this book is that the lean contribution to agile is to bring back the long-term and systemic focus into agile, hence the need for architecture.
Coplien and Bjørnvig bring a fresh and simple vision of lean software development, which is summarized in the book by two formulations of the same principle:
  • Lean = All hands on deck + long term planning
  • Lean = Everybody, All together, Early on

I would not say that “The Toyota Way” could fit into such an equation, but it captures an important part of what lean management is about, and it is relevant to emphasize the long-term vision, which is indeed a key aspect of lean. It also helps the authors to introduce a (creative) tension between lean and agile:
  • Agile is oriented towards change and organic complexity. For instance, it is good to defer decisions (this is not new or “agile per se”, since Knuth is quoted as saying “premature optimization is the root of all evil”).
  • Lean is focused on large-scale complexity (including “complication”) and long-term resilience. Hence the focus on standardization and bringing decisions forward.

This distinction is presented in a full table on page 12, and, frankly, I disagree with most of its rows. For instance, team versus individual is not a good characterization of lean versus agile. Nor is the complicated versus complex debate … and the “high throughput (lean)” versus “low latency (agile)” debate is too restrictive.  My views are expressed in a previous post and I think that there is much more in common between lean thinking and agile thinking. Still, I would definitely agree that the management of time, the systemic and holistic view, and the overall long-term perspective differ between lean and agile. Clearly, the goal of lean software development is to combine both, and this is where the extra emphasis on architecture comes from.  It also explains the emphasis on maintainability, as expressed by this quote from Jeffrey Liker about lean: “a culture of stopping or slowing down to get quality right the first time to enhance productivity in the long run”. The tension between lean and agile is helpful to explain why one needs both the systemic practice of the “Five Whys” in the search for root causes while problem solving, and frequent feedback and adaptive cycles. Indeed, the complexity and the rate of change of our current environment require both: complexity demands the 5 whys, but the rate of change means that this is not enough; quick feedback loops are required.
An interesting development in this book deals with Conway’s law. This principle states that there is a strong dependency between architecture and organization. That is, the system architecture (module organization) will be strongly influenced by the management organization (teams & departments) that produces the piece of software. Conversely, things work best (from a management perspective) when the people organization follows the system architecture. Coplien derives two consequences from this law, which strike me as truly relevant:
  • “Managers are über-architects, a responsibility not to be taken lightly” … here goes the myth that one may do a good job managing software developers without a keen insight into software architecture :)
  •  “There is a need for modularity in large-scale systems” – this is obviously true … and indeed a consequence of Conway’s law. The goal of modularity is to keep changes local, and since it is unfortunately mandatory on the people side as soon as the scale of a project tips to the “large side”, modularity in the organization must translate into modularity in software, hence the need for architecture :)

This post will not do justice to this excellent book, which is full of wisdom. I do not have the time to collect all its pearls, such as “Remember that architecture is mainly about people” or “Software development is rarely a matter of having enough muscle to get the job done, but rather of having the right skill sets present”.  I refer you to my book for my own two cents of wisdom about information system architecture and management. Still, it is worth repeating that architecture is foremost a communication tool to manage change, and not a blueprint for a better world.  One of my favorite quotes from “Lean Architecture” is the following: “The customer has a larger stake in the development process than in the service that your software provides”.  This is the crucial idea at the heart of the “Lean Software Factory”. In a dynamic and changing world, software is not an object, it is a process. Qualities such as evolvability, modularity and openness come from the people and the development process, much more than from the finished product.
This brings me to my second favorite quote, page 131: “The essence of “Lean” in Lean architecture is to take careful, well-considered analysis and distill it into APIs written in everyday programming language”. Here we see the long-term/resilience thread of lean thinking brought into software development in a very practical manner. This quote expresses two key ideas into one sentence:
  • Being “future-proof” means having the right APIs, so that the code may be both extended easily and reused in manners different from those intended in the first place.
  • Defining, I would rather say “growing”, the right set of APIs is an art; it requires practice, wisdom and aesthetic judgment. Still, it requires forethought and analysis, which is architectural thinking.

This should be enough to reconcile agile programming with architecture, if there ever was a need. Coplien and Bjornvig spend a few delightful pages debunking “Agile Myths” such as:
  1.  “You can always refactor your way to a better architecture”. Refactoring is crucial because software is a live object that evolves constantly as its environment changes. Agile methods, like any iterative development processes, are bound to produce “accumulations” that need to be cleaned up.
  2. “We should do things at the last moment”.
    Here we find the “anticipation versus as-late-as-possible” debate that was mentioned earlier.  Many authors, including Mary Poppendieck, consider that lean thinking translates into taking design decisions as late as possible. This book does a good job at balancing the arguments. There is no single answer, but there is a strong plea for “thinking ahead” and “preparing oneself” through architectural design.
  3. Agile = “don’t do documentation”.
    The agile distrust of comprehensive documentation, which gets obsolete before it is used, still stands. However, as soon as scale grows, and as soon as the life expectancy of the software piece grows, there is a need for serious documentation. There are many tools to automate part of the documentation task, especially that which is closely linked to the code. A large piece of software requires storytelling, and this has nothing to do with software; it’s a consequence of human nature and what is needed to motivate a large group of people (see Daniel Pink).


User stories are an integral part of Agile/SCRUM development methods. There are a few interesting pages in this book that show the link between user stories and business processes, which are commonly associated with large-scale waterfall development processes.  The last part of the book deals with DCI (Data, Context and Interaction), a framework that extends MVC and is proposed as the proper foundation for designing modern distributed object-oriented systems. This topic is out of the scope of this post, although I find many similarities with the design philosophy of the CLAIRE programming language. Some of the key insights may be summarized as: the need to recognize and separate complex algorithms from object classes (“Does this mean that procedural design is completely dead? Of course not. There are algorithms that are just algorithms”), the reification of roles and business processes, and the use of context objects to develop functional polymorphism (hence to share and reuse more code), a practice that reminds me of my early days at Bouygues Telecom when we created the PTEs (Processus Techniques Elementaires) – cf. my book on Urbanization.

I will conclude with two quotes from page 92:
  • “Let the human considerations drive the partitioning [architecture], with software engineering concerns secondary”.
  •  “A complex system might use several paradigms”.




Friday, May 10, 2013

Systemic Simulation of Smart Grids (S3G) - Part III

This post concludes the first phase of my computational experiments with S3G (Systemic Simulation of Smart Grids), which I ran during the summers of 2011 and 2012. I presented the results at the 2013 ROADEF conference a few months ago and I have made the extended set of slides available on my box account (left of the blog page – My Shared Documents).
 1. S3G Experiments

A simple description of S3G is available in a previous post. It should help to understand what is presented in the slides, since many of those slides were included in that post. The objective of S3G is to simulate the production and consumption of electricity throughout a long period of time (15 years) with a global “ecosystem” perspective.
I will first add some explanations on three topics: the set of models, the satisfaction model and the GTES search for equilibria. It is important to understand the limits of the current experiment before giving out the preliminary findings, since they need to be taken “with a grain of salt”.

As mentioned earlier, S3G uses very simple models (i.e., simple equations and few parameters) for the components of the “energy production & consumption” complex system. This is a deliberate choice, because I lack the expertise to produce more complex sub-models, and, mostly, because I want to focus on the overall system complexity (that is, what happens when all these simple subsystems are put together). This is clearly explained in my SystemX IRT introduction keynote. Still, it’s worth taking a look at each of these sub-models:
  •  The energy demand generation is quite simple. I start with daily and yearly patterns, obtained by cutting & pasting historic curves found on the web, and I add random noise, which I can control (time- or geography-dependent). I don’t think that this is a limitation for this experiment.
  •  I have a crude vision of “NegaWatts”, which represent energy consumption that may be saved through energy-saving investments. NegaWatts are virtuous: there is no reduction of economic output, but they require money. Here my model is really too simple, but somehow it falls outside the scope of what I was trying to accomplish. I use a simple hyperbolic function to represent the fact that, as electricity prices grow, people are likely to invest to reduce their consumption. Since it is very difficult to foresee negawatt development in the next 20 years, it is better to use a single parameter (the slope of the hyperbola) and make it vary to cover all kinds of scenarios.
  •  I have an equally crude model for demand/response, which, contrary to NegaWatts, is instantaneous but affects economic output. In my model (a simple S-curve), demand is reduced when the peak price becomes too high. We’ll see later on that this is indeed too simple and that it should be further developed in a future step.
  •  The market share model – to determine the market share of the grid operator against the incumbent – is a simple/classic S-curve. My previous experience with similar economic simulations tells me that it is enough to produce a realistic experience (this is not where the systemic complexity lies).
  • On the other hand, the dynamic pricing model – how the incumbent modulates its wholesale/retail price – is the heart of the relationship between the local and the national operators, and my current version is too simple. I assume that the price is a function of the output (demand), so that peak consumption yields a higher price. I have chosen a very simple function for my dynamic price equation: a piecewise-linear function, with a constant price up to a fixed (constant) production level, and then a linear surcharge when production is higher than this constant. Obviously, one would like to test and analyze more complex dynamic pricing schemes, since dynamic prices and demand/response behaviors are a key engine for smart grids. The reason why I used a simple model is that this is precisely the complexity generator for the model, and using an arbitrarily complex model makes it very difficult to analyze later. This pricing structure is under the control of regulation, and I am waiting to better understand what our political instances have in mind before encoding a richer model (see later).
  •  Last, the smart grid electricity production model is reasonably detailed for such an experiment. The decision about which source of electricity to use is actually straightforward, due to the mix of production constraints (one must use the electricity that is produced) and economic goals (when sourcing, get the cheapest source). The only tricky part is the management of storage. I use simple rules, with a number of parameters that are tuned within the machine learning loop of the GTES simulation. Hence I let the simulation engine discover how to best use local storage. For instance, the local operator can use storage both as a “buffer” for its own production and as a “reserve” to play the market (buy when cheap and sell when expensive). Considering the small amount of storage that is actually used (because of storage prices), this part of the model is quite satisfactory.
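To make two of these sub-models concrete, here is a small Python sketch of a piecewise-linear dynamic price and an S-curve demand/response. This is only an illustration: every parameter value (base price, threshold, slope, steepness) is a made-up placeholder, not a constant from the actual S3G code.

```python
import math

def wholesale_price(demand, base_price=50.0, threshold=70.0, slope=2.0):
    """Piecewise-linear dynamic price (per MWh): constant up to a fixed
    production threshold, then a linear surcharge above it."""
    if demand <= threshold:
        return base_price
    return base_price + slope * (demand - threshold)

def shaved_fraction(peak_price, ref_price=80.0, steepness=0.1, max_shaving=0.3):
    """S-curve demand/response: the fraction of demand that is shaved
    grows from 0 towards max_shaving as the peak price climbs past
    the reference price."""
    return max_shaving / (1.0 + math.exp(-steepness * (peak_price - ref_price)))
```

With these placeholder values, a demand of 90 yields a price of 50 + 2 × 20 = 90, and at exactly the reference price half of the maximum shaving (15%) is triggered.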

A GTES run is the simulation of an optimization loop that tries both to maximize each player’s satisfaction and to find a Nash equilibrium. Hence, defining each player’s satisfaction is a critical part of S3G. Let us first recall that there are four players (actually, sets of players) in this “game”. Each player has three goals, with a target value associated to each goal. We define a “strategy” as a triplet of (goal, target) pairs. The satisfaction is then expressed with respect to each goal with a pseudo-linear function: 100% if the target value is reached, and a linear fraction otherwise. The overall satisfaction for a strategy is the average satisfaction with respect to its three goals. In the GTES method, we separate the parameters that represent these goals (grouped as a strategy) from the other parameters that the player may change to adjust its results, which we call the “tactical play”. The principle of the GTES game is that each player tries to adjust its “tactical parameters” to maximize its “satisfaction” (w.r.t. its strategy). Here is a short description of the four players in the S3G game:
  • The “regulator” (political power) whose goal is to reduce CO2 emissions while preserving economic output and keeping a balanced budget (between taxes and incentives). Its three “goals” are, therefore, the total output (consumed electricity + negaWatts), the amount of CO2 and the state budget (taxes - subsidies). Its tactical play includes setting up a CO2 tax, regulating the wholesale price for the suppliers and creating a discount incentive for renewable energies.
  • The existing energy companies, here called “suppliers”, whose goal is to maintain their market-share against newcomers, maintain revenue and reduce exposure to consumption peaks. Their tactical play is mostly through pricing (dynamic), but they also control investment into new production facilities on a yearly basis.
  • The new local energy operators, who see “smart grids” as a differentiating technology to compete against incumbents. Their goal is to grow turnover, EBITDA and market-share. Their real-time tactical play is dynamic pricing, and they may invest into renewable and fossil energy production units, as well as storage units.
  • The consumers are grouped into cities, whose goal is to procure electricity at the lowest average price, while avoiding peak prices and preserving their comfort. The cities’ tactical play is mostly to switch their energy supplier (on a yearly basis) and to invest into “negaWatts”, i.e., energy-saving investments (more energy-efficient homes, etc.).
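The pseudo-linear satisfaction described above can be sketched as follows; the goal names and target values in the example are invented for illustration, not taken from the S3G model.

```python
def goal_satisfaction(value, target):
    """Pseudo-linear satisfaction: 100% once the target value is
    reached, a linear fraction of it otherwise."""
    return min(1.0, value / target)

def strategy_satisfaction(results, strategy):
    """Average satisfaction over the three (goal, target) pairs that
    make up a player's strategy; `results` maps goals to achieved values."""
    scores = [goal_satisfaction(results[goal], target)
              for goal, target in strategy]
    return sum(scores) / len(scores)

# A hypothetical local operator: it exceeds its revenue goal, but only
# reaches half of its market-share and EBITDA goals -> satisfaction = 2/3.
strategy = [("revenue", 100.0), ("share", 0.2), ("ebitda", 10.0)]
results = {"revenue": 120.0, "share": 0.1, "ebitda": 5.0}
```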


GTES stands for Game-Theoretical Evolutionary Simulation. I have talked about it in various posts, and a summary may be found here. I gave a keynote talk about GTES at CSDM 2012, the slides are available here. GTES is a framework designed to study a model through simulation, in order to extract a few properties from this model (learning through examples), either explicitly or implicitly.  GTES is based upon the combination of three techniques:
  1.  Sampling: since some parameters that occur in the economic equations are unknown, we draw them randomly from a confidence interval, using a Monte-Carlo approach. Monte-Carlo simulation has become quite popular (especially in the finance world) over the last decades (as computers became more powerful, obviously). The need for Monte-Carlo is a signature of complexity and non-linearity: simulation becomes necessary when one cannot reason with averages. The beauty of linear equations is precisely that one may work with average values. In a complex non-linear system, deviations are amplified and there is no other way to predict their effect than to look at it, case by case (hence the sampling approach).
  2. Search for Nash Equilibrium in a repeated game: We set the parameters that define the player’s objective functions and look for an equilibrium using an iterative fixed-point approach (in the tradition of the Cournot Adjustment). The good news with S3G is that it is a “simple” complex system, hence finding a Nash equilibrium is easy. However, it is precisely easy because of the simple pricing model (cf. previous discussion).
  3. Local Search as a machine learning technique: once the parameters that define the objective function are set, the other parameters that define the behavior of each player may be computed to find each player’s “best response” to the current situation. We use a simple local search (“local moves” = dichotomic search for the best value of each tactical parameter), coupled with “2-opt”: the random exploration of moving two parameters at the same time, using “hill climbing” as a meta-heuristic. From an OR point of view, these are rudimentary techniques, but they seem to do the job. The complexity of the optimization engine that one must embed into GTES depends on the complexity of the model. If the dynamic pricing model were made more complex, a stronger local search metaheuristic would be necessary.
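The way these three techniques fit together can be shown with a heavily simplified skeleton of the iterated best-response loop, in the spirit of the Cournot adjustment. Everything here is a stand-in: each player has a single scalar tactic, the satisfaction function is an argument, and the dichotomic search is written as a ternary search over a unimodal satisfaction curve.

```python
import random

def best_response(satisfaction, lo=0.0, hi=1.0, steps=40):
    """Dichotomic ('local move') search for the tactical value that
    maximizes a unimodal satisfaction function on [lo, hi]."""
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if satisfaction(m1) < satisfaction(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

def cournot_adjustment(players, satisfaction, rounds=20):
    """Iterated best responses: each player in turn re-optimizes its
    tactic given the others'; a fixed point of this loop is a Nash
    equilibrium for the sampled parameter set."""
    tactics = {p: random.random() for p in players}  # Monte-Carlo start
    for _ in range(rounds):
        for p in players:
            tactics[p] = best_response(lambda x: satisfaction(p, x, tactics))
    return tactics
```

In the real GTES loop the “tactic” is a vector of parameters and the search is coupled with 2-opt moves; the sampling step would rerun the whole adjustment for each random draw of the unknown economic parameters.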

Explaining GTES will take many years … I was invited to ROADEF’s yearly event last month to present some of the successes that I have had with this approach over the past 10 years. I have a book, “Enterprises and Complexity: Simulation, Games and Learning”, in my “pipe”, but I expect at least five more years of work are needed to get it to a decent state (in terms of ease of understanding).

2. Most interesting findings with S3G experiments

An “S3G session” is made of interactive runs of “experiments”, which are GTES computational executions. More precisely, an experiment is defined through two things:
  • The randomization boundaries, for those parameters that will be sampled.
  • Some specific values for some parameters, since the goal of a “serious game” is to play “what-if scenarios” by explicitly changing these parameters. For instance, we may play with the investment cost of storage, to see whether storage is or will be critical to smart grids.

Multiple scenarios have been played to evaluate the sensitivity to “environmental” parameters such as the variability of energy consumption (globally or locally), the fossil energy price (gas and coal), the possible reduction of the nuclear assets, the impact of carbon taxes or the impact of wholesale price regulation. Here is a short summary of the main findings that were presented at ROADEF:
  •  Smart Grids and variability. One theoretical advantage of smart grid operators is that they could react better to variations. Simulation does show some form of better reaction from the local operator than from the national operator, both to fluctuation (electricity demand that varies compared to historical forecast) and to local variation (for instance, through local changes of climatic conditions). However, the difference is very small, and could be disqualified as insignificant from a statistical perspective. This result depends on storage prices (see later) and the wholesale price structure. With the current values, one of my key arguments in favor of smart grids (systems of distributed systems are expected to be more flexible and reactive) does not seem to hold.
  •  Carbon tax and Nuclear strategy. I played with the carbon tax to see if raising it would have an effect, and it does, but a negative one, since it favors nuclear energy and since green energy is still too expensive. On the other hand, the decision to reduce the share of nuclear energy in the national supplier (either a long-term withdrawal or a long-term cap, as announced by the French government) creates favorable conditions for smart grid operators, quite logically. However, simulation shows that the results are weak (a small advantage) and unstable (they depend heavily on the overall systemic equation of wholesale prices coupled with “environment” variables such as energy prices).
  •  Storage and Photo-voltaic costs. As explained earlier, I used the Web as my main information source, and got unit prices for storage (per MWh) and photo-voltaics that vary considerably according to the sources. I designed a number of scenarios to see what would happen if prices fall, as expected by a number of “green experts”. The availability of cheap storage has an important impact, but one needs a price reduction by a ratio of 5 to 10 (depending on the starting point) for this impact to materialize. The simple rule seems to be that the storage TCO (total cost of ownership) should get as low as 50% of the wholesale price to shift the system’s behavior (quite logical if you think about it). A similar remark may be made about photo-voltaic energy, whose price is still far too high to change the smart grid operator’s economic model.
  •   Wholesale & retail price structure. This is the heart of the smart grid ecosystem: the rules/regulations that govern wholesale pricing – which control the “coopetition” between supplier & operator – and the dynamic pricing for retail, which controls the benefits drawn both from demand/response and negawatts. In the game theory tradition, we have built a strategy matrix that shows the result of conflicting strategies between the supplier and the operators, ranging from bold (focus on market share) to aggressive (focus on revenue) through “soft” (more conservative). The sensitivity to the price regulation structure is such that it does not make sense to draw too much out of my simulations, except the fact that this is the critical part.
  • Sensitivity to oil price. I have played with a number of scenarios regarding fossil energy price trends over the next 15 years. The sensitivity is much lower than expected, when comparing suppliers against operators. There is a clear impact on consumers, but the benefit in favor of green energy and smart grid operators is offset by the advantage in favor of nuclear energy. One may add that with shale gas, non-conventional oil and coal, this type of scenario is not likely in the next 15 years. What we see in the US is precisely the opposite.


3. Limits of current approach and next steps

Let me first summarize three obvious limits to the S3G approach:
  •  As explained, both wholesale and retail dynamic pricing models are too simple. Not only is the shape of the curve simplistic, but the fact that price depends only on total demand is unrealistic (taking production costs into account is a must).
  • One of the expected benefits of smart grids is improved resilience, both to catastrophic events and to significant internal failures. I did not try to evaluate resilience, because I did not have enough data to generate meaningful scenarios. If you look at what is happening in Japan, local storage is deployed to increase resilience in the event of a natural catastrophe, with good reasons (together with HEMS: home energy management systems).
  • My demand/response model is equally too simple, from two separate perspectives. First, I only look at “shaving”, that is, electricity that is not consumed because the usage is forsaken for price/availability reasons. An interesting alternative is to look at demand displacement, where consumption is “shifted” instead of “shaved”. Many usages, mostly related to heating, have enough inertia to be shifted by a few minutes. The other simplifying dimension is that I only look at the instantaneous benefit brought by demand/response, that is, the non-consumption of electricity at a time when prices are high. However, market prices do not rise that much, nor for long enough, to make this “shaving” worth a lot of effort. On the other hand, it may help to avoid investing into excessive marginal capacity, which has a higher payoff.


This last argument is pointed out in the « Loi Brottes ». This article explains clearly the difference between the “capacity adjustment” value creation and “production adjustment”.
  •  “As the law currently stands, no mechanism is provided to remunerate demand curtailment for its capacity value between the end of 2013 and the winter of 2015-2016, other than through the adjustment mechanism, which limits the development of curtailment capacities”, states the explanatory memorandum. The purpose of this amendment is therefore to ensure, as soon as the law comes into force, “the development of curtailment through a call-for-tenders scheme, pending the implementation of a permanent capacity mechanism that will allow the actors concerned to develop production capacities and consumption-curtailment capacities”.

The reason why I focus on “production adjustment” is that it is much easier to simulate. Capacity adjustment is a three-party value proposition (the user, the demand-response operator, and the producer whose capacity may be reduced). It requires regulation (hence the Brottes law) to turn the avoided capacity into operational benefits for the operator, who will eventually share them with the user.
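The shaving versus shifting distinction mentioned above is easy to illustrate. In this hypothetical fragment, `load` is a list of per-period demands and `cap` is the level above which demand/response kicks in; both functions and their parameters are made up for the example.

```python
def shave(load, cap):
    """Shaving: demand above the cap is simply forsaken (lost output)."""
    return [min(x, cap) for x in load]

def shift(load, cap):
    """Shifting: demand above the cap is deferred to later periods,
    as heating loads with enough thermal inertia allow."""
    shifted, carry = [], 0.0
    for x in load:
        x += carry
        shifted.append(min(x, cap))
        carry = max(0.0, x - cap)
    return shifted
```

With `load = [1, 3, 1]` and `cap = 2`, shaving returns `[1, 2, 1]` (one unit of consumption is lost for good), whereas shifting returns `[1, 2, 2]` (the unit is merely delayed), which is why displacement preserves economic output.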

I will leave the S3G code alone for a while. When I resume this work (2014), I plan to take the following next steps:
  • Improve the satisfaction formula, using a product form instead of a sum. This is a classic technique when defining KPIs for performance measurement. A product (i.e., multiplying the various sub-terms of section 1 instead of summing them) yields a more truthful representation of strategic satisfaction (it is much better to reach all three goals at 60% than to get 100% on two and totally miss the third).
  •  Introduce parallelism (with a MapReduce architecture) to reach more stable results with more samples. Monte-Carlo simulation is designed for easy parallelization.
  •  Enrich the dynamic pricing model (while sticking to piecewise-linear formulas) and re-evaluate the “model constants” (energy production and storage prices, which constantly evolve).
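The planned change to the satisfaction formula is easy to illustrate: a product aggregation penalizes a totally missed goal, where an average hides it. The scores below are purely illustrative.

```python
def average_satisfaction(scores):
    """Current formula: the mean of the per-goal satisfactions."""
    return sum(scores) / len(scores)

def product_satisfaction(scores):
    """Planned formula: multiplying the per-goal satisfactions, so
    that missing any single goal drags the whole score down."""
    result = 1.0
    for s in scores:
        result *= s
    return result

balanced = [0.6, 0.6, 0.6]   # all three goals at 60%
lopsided = [1.0, 1.0, 0.0]   # two goals reached, one totally missed
# The average prefers the lopsided outcome (0.67 > 0.60), while the
# product correctly prefers the balanced one (0.216 > 0.0).
```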





 