All Systems Need A Little Disorder

Tradeoffs in Managing Complex Adaptive Systems

CAS Balancing Act

Managing complex adaptive systems (CAS) is a constant balancing act between many goals that are often in conflict with each other. Most systems need to be managed to limit resource usage i.e. they need to be efficient. The mantra of efficiency is most visible in economic systems but even other systems such as biological systems cannot typically get away with profligate wastage of resources. But efficiency is not enough. Systems also need to be robust. There is no point in achieving optimal efficiency if the first change in environmental conditions causes the system to collapse. Robustness is not just about avoiding collapse and persisting. Robustness is the ability of a system to maintain critical functionality in the face of significant stress. Robustness is concerned with the maintenance of functionality rather than the preservation of specific system components.

Apart from being efficient and robust, most systems also need to be able to innovate and generate novelty, an ability that the evolutionary biologist Andreas Wagner has referred to as innovability. Here, it is not the ability of the system to make incremental, quantitative improvements that we are concerned with. Innovability refers to the system’s ability to generate disruptive, qualitative and fundamental improvements that enable it to undergo transformational change, that is, to produce ‘game changers’1. It is this combination of robustness and innovability that I refer to as resilience.

Resilience

Control initiatives targeted at CAS often equate robustness with stability. In this perspective, there may not even be a tradeoff between efficiency and robustness. The essence of control is the elimination of disorder, and efficiency and stability are both fundamental to control. To the extent that the control process and the model that guides it are accurate and reliable, control itself delivers stability. To the extent that the model is inaccurate, i.e. to the extent that the map does not match the territory, the system needs to maintain some slack and redundancy to buffer against environmental changes that are not anticipated by the model.

Yet it is often observed in complex adaptive systems that stability leads to fragility. Extended periods of stability and, by extension, policies that focus on stabilisation frequently end in collapse. Examples of this phenomenon are plentiful. Suppression of forest fires in the United States has transformed forests across the country into a veritable tinderbox prone to increasingly catastrophic fires. Levees and embankments on many of the world’s rivers may have succeeded in controlling their path in normal years but only at the expense of an increasing frequency of catastrophic floods. The economic ‘Great Moderation’ that the developed economies enjoyed since the 80s turned out to be the prelude to the deepest economic recession since the Great Depression.

The Territory Becomes The Map (Almost)

The conventional debate on the fragility of complex adaptive systems in the modern era and the reasons for this control failure focuses on the inadequacy of the models/processes used within the system in capturing the “true” complex reality and/or the hubris of the system participants/managers regarding the utility of these flawed models. For example, much ink has been spilt blaming our economic crisis on everything from the efficient markets hypothesis and the state of financial modelling (Black-Scholes, VaR etc) to the dominance of rational expectations within macroeconomics. This is an oft-repeated criticism of control projects (especially centralised control projects) summarised by H.L. Mencken: “For every complex problem there is an answer that is clear, simple, and wrong”. This argument does not imply that the controlled, stabilised system is any worse off than the unstabilised system or indeed any less sustainable. Maybe a better, more realistic model will help us avoid system fragility. Maybe we need decentralised control or simply better control process technology.

The implicit assumption in this worldview is that control and management of complex adaptive systems starts by first figuring out a model that captures the essential dynamics of the system. The fatal flaw in this logic is that real-world control projects do nothing of the sort. As James Scott has argued, the first step of control is the simplification of the system itself to reduce uncertainty and render it controllable.

Control strategies (technological and institutional) are dependent upon a redesign of the domain that needs to be controlled. The aim of this redesign is to reduce uncertainty and complexity in the domain such that control technologies and institutions are able to operate effectively upon the domain. Simplifying the system from an illegible and complex entity to an easily measurable and legible entity enables a codified, centralised and discretion-free control regime to be put in place. Similarly, automating and replacing the human role in a domain frequently involves a redesign of the operating environment in which the task is performed to reduce uncertainty, so that the underlying task can be transformed into a routine and automatable one. This redesign of the work environment to reduce uncertainty lies at the heart of the Taylorist/Fordist logic that brought us the assembly line production system and has now been applied to many white-collar office jobs. Herbert Simon identified this when he observed2: “If we want an organism or mechanism to behave effectively in a complex and changing environment, we can design into it adaptive mechanisms that allow it to respond flexibly to the demands the environment places on it. Alternatively, we can try to simplify and stabilize the environment. We can adapt organism to environment or environment to organism”.

The picture below taken from James Scott’s book ‘Seeing Like A State’ illustrates the radical nature of the simplification inherent to the control project in German forestry during the late eighteenth century. Prior to the second half of the eighteenth century, German forests were messy and complex constructs. Fire was a regular occurrence thanks to the prevalence of slash-and-burn agriculture and the forests themselves contained a diverse mix of trees, plants and animals. The emergence of modern German forestry changed all that3. The number of species was reduced with the trees often planted simultaneously in a monoculture and wildfires were suppressed, with the primary aim of maximising the yield of timber that could be extracted from the forest.

Simplification of German Forestry

The simplification of the domain is as much a prerequisite of control as it is a consequence of the control effort. During the initial stages of the control project, simplification and implementation of control technologies work in a positive feedback loop where simplification aids control which further simplifies the domain. Even in domains where simplification is not an explicit aim, it is often the result of the control effort. For example, modern medicine probably doesn’t set out to tamper with the natural, complex dynamics of the human body. But it often does exactly that: by using an implicit map of disease as a battle against a single microbe or chemical imbalance, the process of medication transforms the body into something resembling exactly that. Antibiotics, by erasing the bacterial population of the body, transform the complex ecological dynamics of the human microbiome into a straight “shoot-out” between the drug and the harmful bacteria.

Scott’s central argument is about much more than the inability to capture the complex interactions and non-linearities within the system. What causes the failure pattern, the ‘control treadmill’ described in the next section, is not so much the failure of the simplified control regime to capture the complex reality but the success of the simplified control regime in moulding and shaping reality such that it mimics the simplified control template. The drive to simplify the system, to make it legible4, is an integral component of the control project. This transformation from illegibility to legibility simplifies the system to a shell of its former complex self.

Control and Simplification

If the transformation of the system from an illegible, complex domain to a legible and simplified domain could be achieved in a “perfect” manner where all irreducible uncertainty was banished from the system, then the system would not become fragile. It is the incomplete and imperfect manner of this transformation, the fact that the territory “almost” mimics the map, that triggers an unerringly common pattern of control failure.

Control Treadmill

The problem with the idea of perfect control is not that it fails. The problem is that the control project seems to succeed at first and then fails. If all attempts at control simply led to immediate systemic fragility, then we would simply reverse our efforts and restore the system. But the reality is nowhere near as simple. The pattern of control goes through various stages that make the system almost irreversibly fragile and increasingly uncontrollable. The progression resembles a treadmill from which it is increasingly difficult to exit without risking systemic collapse.

Initial Success Followed By Increasing Fragility and Fall in Productivity

Initially, the shift of German forests towards a monoculture of a productive species (such as the Norway spruce) was spectacularly successful and productivity and timber yield soared. It took as long as a century from the initial intervention for the negative consequences to appear and for productivity to start falling. The same monoculture that had provided a stable and productive environment at the beginning of the control initiative eventually led to the diminished nutritive capacity of the soil and an explosion in the pest population.

During the initial years of fire suppression, the fires suppressed tended to be small and inexpensive to put out5. But by the 70s, prolonged fire suppression had transformed many forests in the United States into fuel-heavy, tightly coupled and homogeneous entities stripped of any fire resistance and prone to catastrophic fire at the smallest disturbance.

Similarly, macroeconomic stability in the 50s and 60s was enabled by the deleveraged nature of the American economy after the Second World War. Even in interventions by the Federal Reserve in the late 60s and 70s, the amount of resources needed to shore up the system was limited. But the productivity growth of the 50s and 60s has been followed by a systematic downturn in productivity growth since the 70s6 and a financial system prone to increasingly catastrophic collapses.
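The fire-suppression pattern described above can be caricatured in a few lines of Python. This is a deliberately crude sketch rather than a calibrated model: the yearly fuel growth, the single ignition per year and the `capacity` threshold beyond which a fire escapes control are all illustrative assumptions, not figures from the sources cited here.

```python
def simulate(years, suppress, growth=1.0, capacity=10.0):
    """Return the size of each year's fire in a toy forest.

    Assumptions (hypothetical, for illustration only):
    - fuel grows by `growth` units every year;
    - exactly one ignition occurs every year;
    - without suppression, the fire burns all accumulated fuel;
    - with suppression, the fire is put out while fuel <= `capacity`,
      but escapes and burns everything once fuel exceeds it.
    """
    fuel = 0.0
    burns = []
    for _ in range(years):
        fuel += growth
        if suppress and fuel <= capacity:
            burns.append(0.0)   # fire put out while small; fuel keeps building up
        else:
            burns.append(fuel)  # fire consumes all accumulated fuel
            fuel = 0.0
    return burns


let_burn = simulate(33, suppress=False)
suppressed = simulate(33, suppress=True)

print("largest fire without suppression:", max(let_burn))
print("largest fire with suppression:   ", max(suppressed))
print("total burned (no suppression / suppression):", sum(let_burn), sum(suppressed))
```

With these assumed parameters, both policies burn the same total amount of fuel over 33 years. Suppression delivers ten quiet years and then concentrates the damage into a few fires eleven times larger than any fire in the unsuppressed forest.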

Increased Cost Of Control and No Easy Way Out

The increased fragility of German forests eventually forced forest managers to deploy increasing outlays of fertilisers and pesticides to prevent system productivity from collapsing and maintain stability. After the initial low costs of fire suppression, costs and total area burned have ballooned in the last three decades7. Similarly, effective macroeconomic stabilisation and maintenance of economic growth that only required a cut in interest rates by the Fed in the 80s now requires the deployment of trillions of dollars in monetary and fiscal stimulus.

Managing the system in this increasingly fragile yet costly state is akin to choosing between the frying pan and the fire. By moulding the territory to fit the map, the control process changes the fundamental nature of the system in such a manner that even after a systemic collapse, the prior resilient state is unlikely to be attained. In the context of our financial system, the evolution of the system means that turning back the clock to a previous era of resilience is not an option. For example, re-regulation of the banking sector is not enough because it cannot undo the damage done by decades of financial “innovation” and adaptation by the banks in a manner that does not risk systemic collapse.

Can Perfect Control Be The Solution?

Partly due to the difficulty of turning back the clock, system managers prefer to focus not so much on prevention but on effective medication – as much medication as is needed, to be delivered in a rapid and targeted fashion. As the system loses its ability to cope with disturbances on its own, progressively higher doses of the medication are required. At some point, the costs of this increasing medication may be prohibitive. To some extent this is the story of fire suppression. The costs of monitoring and stamping out each fire at source simply become too prohibitive after prolonged periods of fire suppression. Ultimately, there is only so much that we are willing to spend to prevent forest fires. However, this does not necessarily have to be the case. The increasing medication required may plateau at a level that is not prohibitively expensive due to limits on systemic fragility. For example, forests can only accumulate so much fuel and so many fire-prone trees. Similarly, even if the human body has lost all its ability to deal with illness, maybe a sufficiently high stable dose of medication will maintain stability. Moreover, the maximum price that we are willing to pay for stable personal or macroeconomic health is probably a lot higher than the maximum cost that we are willing to pay for a stable ecology.

Not only does the dose need to be high enough, it must be delivered with sufficient accuracy and speed. Partly this is again a problem of cost. But this is also a problem of the real limits of our interventionist control abilities at this moment in time in many domains like medicine or macroeconomics. Just as the dosage increases over time in a stabilised system, the speed and accuracy with which it needs to be delivered also increases as the underlying system becomes increasingly fragile. This limitation in the control process manifests itself in a phenomenon observed in patients on psychotropic medication called ‘rapid cycling’. Patients seem to veer between euphoria and depression with barely any time spent in the interim normal state. Similar signs can be seen in our hyper-stabilised economic system where the financial markets veer between ‘risk-on’ and ‘risk-off’ almost solely based on the imperfect stabilising interventions of the monetary and fiscal authorities.

Nevertheless, even this argument leaves open the possibility of a technological solution to the problem. Maybe stabilisation delivered in a targeted manner within a sufficiently small time of the initial illness can maintain the stability of the system. Localised, embedded microbial therapy is already a reality8. Maybe the fire suppression solution of the future will deliver an unlimited dose of fire suppressing power at any point in the forest on an instantaneous basis. Maybe future stock-market crashes will last seconds and be countered by targeted, distributed, automated monetary stimulus implemented over milliseconds.

The Illegibility Of The Near-Perfect Control Process

As long as the control process remains imperfect, the system remains fragile. Some of the reasons for this have been discussed in previous sections (fuel buildup, increasingly tight coupling, homogenisation etc). But as the control process approaches ‘perfection’, the increasingly complex nature of the process itself now becomes a cause of systemic fragility and somewhat ironically, causes the system to become illegible again.

As James Beniger has observed9, the information revolution is better understood as a continuation of the control revolution that started two centuries ago. This ‘control revolution’ encompassed a lot more than just technological control via the advent of information processing and communication technology. Equally critical was the role of institutional innovations in control (e.g. F.W. Taylor’s ‘scientific management’) that led to the rapid growth and dominance of the modern bureaucracy. However, the information revolution has enabled a far greater complexification of the control process itself. It is tempting to respond to each successive control failure with a more complex control algorithm and more data. Effective control that avoids collapse requires increasingly complex algorithms and larger, more accurate sets of data. The algorithmic and computational complexity of the controlled system explodes and becomes impossible to comprehend for the human operator in charge even though the human operator is still responsible for monitoring and avoiding catastrophes and still needs to rely on his judgement for this task.

James Scott argues that complex systems need to be managed in a decentralised manner dependent upon the tacit and intuitive knowledge of those at the heart of the system. But the new algorithmised complexity, which humans are expected to monitor and to step into when things go wrong, is simply beyond the intuitive understanding of humans. Whereas the earlier illegible system was at least comprehensible to human intuition, the new algorithmic, data-heavy system is not. In this manner, the system comes full circle. The complexity and illegibility of domain uncertainty has been replaced by the complexity and illegibility of the control process itself.

Illegibility of Control

All Systems Need A Little Chaos

If our aim is simply perfect stability and order, then perfect control (if attainable) is sufficient. Our failures in most complex domains such as medicine and economics can be seen as failures that can and will be surmounted with technological and institutional innovation. But there still remains one insurmountable problem with the idea of perfect order and control. The state of perfect order, even if it is achieved and maintained, is a pathological, dysfunctional state in which the system loses its ability to generate and cope with novelty. Innovation and evolution are disorderly rather than equilibrating processes. For example, economies innovate via a process of creative “destruction”. The Soviet economic project failed in no small part due to its success in achieving stability and its corresponding failure in innovating and generating novelty.

Resilient systems are simultaneously robust and innovable. At first glance, innovability and robustness appear to be incompatible. Robustness requires that the old functional capabilities of the system be preserved until new capabilities are found. Innovability, on the other hand, involves a leap into the dark, an exploration that carries with it a significant risk of failure. Yet, many real-world systems – biological, ecological, economic – are able to resolve this tension and do so without an intolerable sacrifice in efficiency.

Micro-Fragility is The Key To Macro-Resilience

Resilience has no meaning unless we specify the level of analysis. You can only choose resilience at one level, not at all levels. For example, a central banker should be concerned with the resilience of the macroeconomy, not of its individual firms. A CEO should be concerned with the resilience of his firm, not of its individual departments. An individual human being is most concerned with the resilience of his body, not of the individual cells within it. The apparent conflict between evolvability and robustness arises from a fallacy of composition where system resilience is assumed to arise from the resilience of its constituent parts, when in fact it arises from the fragility of its parts. In other words, macro-resilience arises not from micro-resilience but from micro-fragility.

For example, one of the critical preconditions which allows biological systems to be robust and generate novelty is that the constituent parts, i.e. the genetic makeup of organisms (known as the genotype), are fragile and exposed to random changes whereas the features and traits of the organism (known as the phenotype) are robust with respect to changes/mutations in the organism’s DNA/genotype10. Similarly, macroeconomic resilience is dependent upon firm-level fragility: in the words of Burton Klein11, “the pursuit of parallel paths provides the options required for smooth progress”. Ecosystems are often characterised by robust macro-yield and productivity and significant diversity and volatility in species-level populations12.

Underground Micro-Variation and Discontinuous Macro-Transformation

Phenotypic robustness by definition implies that the phenotype does not change with most mutations in the genotype. This would seem to rule out the sort of phenotypic variability/evolvability that is required for transformative change – after all, how can new phenotypes be explored if genetic alterations typically have no impact on the phenotype? But this intuition which holds when it comes to incremental, quantitative change does not hold for transformative qualitative change. Macro-robustness enables the micro-constituents to accumulate a store of cryptic “underground” variation that enables intermittent transformative change at the macro-level.

Similarly, transformative change in ecological systems depends upon the reservoir of potential functionality that is hidden in the background13. There is significant evidence that disruptive innovation within the firm depends upon the pool of pre-adaptations available to the firm, a sort of underground technical knowledge base accumulated by the firm14. At the industry level, the pool of latent variation resides in the fringes, often within a constantly churning pool of new entrants.
Through the accumulation of underground variation, incremental micro-level change in a fragile component layer enables discontinuous macro-level transformative change in a robust manner.

Slack And Redundancy For Critical Functions Only

Most systems maintain significant slack and redundancy for critical functions and rightly so. But this does not imply that slack must be built in for non-critical functions as well. By allowing slack for non-critical functions, the system often automatically compensates for stress events and functions with no outward signs of distress even when many such non-critical functions are stressed or have even failed. The slack allows latent defects to build up, leads to tight coupling across the system and an inevitable multiple-failure. This phenomenon is called the fallacy of ‘defence in depth’15. Somewhat counter-intuitively, micro-fragility of the non-critical components aids not only the macro-system’s ability to innovate but also its robustness.

Non-critical errors, instead of being hidden by redundancies, must be made visible to the system manager so that he receives accurate and timely feedback on the health of the system. Many modern automated, algorithmised systems are illegible and opaque precisely because they hide the existence of non-critical errors from the system manager through slack and automated fall-back procedures. The opacity of the day-to-day dynamics also gradually deskills the system manager who ends up having to manage catastrophic situations without gathering any experience of managing smaller failures. During the eventual multiple-failure, the manager’s job is made harder as he has to deal with not only the failures but often the complexity of the defence system itself. The buildup of latent errors has been responsible for many modern accidents. For example, Malaysia Airlines Flight 124 in August 2005 barely avoided crashing when its “fault tolerant” design enabled a failure to be hidden for a decade16.

Resilience Is A Near-Orderly State

Resilience As Near-Orderly

If resilience were simply about avoiding collapse from disturbances, then redundancy would be sufficient. But resilience is also about maintaining evolvability in a competitive, evolutionary system. Businesses need to not only protect against a temporary fall in demand for their products but against the more existential threat from innovating competitors and changes in the competitive landscape. If we need to maintain robustness as well as the ability to innovate, then combining diversity with redundancy seems to be the obvious solution. However, a redundant and naively diverse system is also a dramatically inefficient and suboptimal system that can and will be outcompeted easily in most environments. It may be an avenue open to systems that are not subject to incessant competition but this is not the reality for most biological or economic systems. And moreover, we can do better.

How Systems Achieve Near-Optimal Resilience: Distributed Robustness

Although it is easy to imagine the characteristics of an inefficient and dramatically sub-optimal system that is robust, complex adaptive systems operate at a near-optimal efficiency that is also resilient. Such near-optimal resilience in both natural and economic systems is not achieved with simplistically diverse agent compositions or with significant redundancies/slack at agent level. The key to achieving resilience with near-optimal configurations is to tackle disturbances and generate novelty/innovation with an emergent systemic response that reconfigures the system rather than simply a localised response, through distributed robustness17 rather than plain-vanilla redundancy and diversity. By placing the burden of robustness on a systemic reorganisation, the cost of maintaining robustness is far below the cost that would be incurred if redundant identical “backup” resources were relied upon.

Instead of being composed of redundant specialised components, resilient biological systems tend to consist of multi-functional components. Moreover, many system functions can be carried out by any one of these multi-functional components depending upon system context. This combination of partial functional overlap and multi-functional components enables resilience without a significant investment in excess resources18 as one component can serve as an excess resource for multiple functions. Resilient ecological systems often rely on weak links19, 20 rather than naive ‘law of large numbers’ diversity. Species that often have only a weak role to play on average tend to have a strong and important role to play in maintaining resilience during times of stress. Partial overlap amongst multi-functional agents (also known as degeneracy) is frequently the key to resilient economic and social systems as well. But the drive towards specialisation and maintaining focus on a firm’s core competence may be too strong to allow overlapping functionalities to persist. In economic systems, unlike in biological and ecological systems, resilience of functionality can be maintained by a constant inflow of new entrants even when components are specialised.
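The resource saving from partial functional overlap can be made concrete with a toy comparison. The sketch below is only an illustration under stated assumptions: the four functions, the component sets and the rule that a component performs one function at a time are all hypothetical, and the brute-force check is not the method of any of the papers cited here.

```python
from itertools import permutations

FUNCTIONS = "ABCD"  # four hypothetical system functions


def can_cover(components):
    """True if every function can be assigned to a distinct capable component.

    Assumes each component performs only one function at a time.
    """
    return any(
        all(f in comp for f, comp in zip(FUNCTIONS, chosen))
        for chosen in permutations(components, len(FUNCTIONS))
    )


def survives_any_single_failure(components):
    """True if all functions stay covered whichever single component fails."""
    return all(
        can_cover(components[:i] + components[i + 1:])
        for i in range(len(components))
    )


# Plain redundancy: one specialist per function plus an identical backup.
redundant = [set(f) for f in "AABBCCDD"]

# Degeneracy: multi-functional components with partial overlap.
degenerate = [set("AB"), set("BC"), set("CD"), set("DA"), set("AC")]

# Specialists with no backup at all, for comparison.
specialists = [set(f) for f in FUNCTIONS]

for name, system in [("redundant", redundant),
                     ("degenerate", degenerate),
                     ("specialists", specialists)]:
    print(name, len(system), survives_any_single_failure(system))
```

In this toy setting the degenerate system tolerates any single failure with five components, where plain redundancy needs eight. The one extra component acts as a spare for several functions at once, because the survivors can reshuffle their roles.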

Distributed robustness allows a system to maintain an apparently homogeneous and efficient system that nevertheless retains the ability to innovate and cope with a change in environmental conditions. For example, the internet search ecosystem, dominated by Google, appears to be extremely concentrated and fragile on the surface. But this ignores the constant churn at the margins and the small players ready to fill the void if Google collapses, or provide the disruptive impetus required for innovation if Google simply becomes stagnant. The diversity and slack at the fringes of the system enables a near-optimal yet resilient state.

The viewpoint which emphasises weak links, degeneracy and new entrants implies that it is not the dominant species/firms in the system that determine resilience but the presence of smaller players ready to reorganise and pick up the slack when an unexpected event occurs. In a stable environment, the system may become less and less resilient with no visible consequences – weak links may be eliminated and barriers to entry may progressively increase with no damage done to system performance when the environment is stable. Yet this loss of resilience can prove fatal when the environment changes and can leave the system unable to generate novelty/disruptive innovation. This highlights the folly of statements such as ‘what’s good for GM is good for America’. We need to focus not just on the keystone species21, but on those species at the fringes of the ecosystem.


  1. in the words of the evolutionary biologist Andreas Wagner from whom the term ‘innovability’ is also borrowed via ‘The Origins of Evolutionary Innovations: A Theory of Transformative Change in Living Systems’.  ↩

  2. Herbert Simon as quoted here by Richard Langlois. ↩

  3. Chapter 1 of James Scott’s book ‘Seeing Like a State’. ↩

  4. The concept of legibility was introduced in James Scott’s book ‘Seeing Like a State’. ↩

  5. In the case of wildfires, “the 10 am policy, which guided Forest Service wildfire suppression until the mid 1970s, made sense in the short term, as wildfires are much easier and cheaper to suppress when they are small. Consider that, on average, 98.9% of wildfires on public land in the US are suppressed before they exceed 120 ha, but fires larger than that account for 97.5% of all suppression costs” (Donovan and Brown). ↩

  6. ‘The Great Stagnation’ by Tyler Cowen.  ↩

  7. Donovan and Brown, as quoted in note 5. ↩

  8. http://www.darpa.mil/NewsEvents/Releases/2012/09/27.aspx. ↩

  9. “The Information Society, I have concluded, is not so much the result of any recent social change as of increases begun more than a century ago in the speed of material processing. Microprocessor and computer technologies, contrary to currently fashionable opinion, are not new forces only recently unleashed upon an unprepared society, but merely the latest installment in the continuing development of the Control Revolution.” (James Beniger, ‘The Control Revolution’). ↩

  10. ‘Robustness: mechanisms and consequences’ by Joanna Masel and Mark L. Siegal. ↩

  11. ‘Dynamic Economics’ by Burton H. Klein. ↩

  12. ‘Productivity and sustainability influenced by biodiversity in grassland ecosystems’ by Tilman et al (1996). ↩

  13. ‘Novelty, Adaptive Capacity, and Resilience’ by Allen and Holling (2010). ↩

  14. ‘Technological pre-adaptation, speciation, and emergence of new technologies: how Corning invented and developed fiber optics’ by Gino Cattani (2006). ↩

  15. ‘Jens Rasmussen’ ↩

  16. ‘Automated to Death’ By Robert N. Charette (2009). ↩

  17. ‘Distributed robustness versus redundancy as causes of mutational robustness’ by Andreas Wagner (2005). ↩

  18. ‘Networked buffering: a basic mechanism for distributed robustness in complex adaptive systems’ by James Whitacre and Axel Bender (2010).  ↩

  19. ‘Strong effects of weak interactions in ecological communities’ by Eric Berlow (1999). ↩

  20. ‘Weak trophic interactions and the balance of nature’ by McCann et al (1998).  ↩

  21. http://en.wikipedia.org/wiki/Keystone_species ↩

Comments

  1. Ashwin, Love your site. I’m thinking along very similar lines. I’m surprised that you don’t highlight Holling’s Panarchy framework (which you’re obviously familiar with, given cite #13) or the Intermediate Disturbance Hypothesis, since both seem right up the alley you are exploring. Or perhaps I just couldn’t find where you discuss them.