Tue Feb 18 2020
by John Hill
Models are built to understand systems.
A model is a simplified representation of a system, intended to support understanding of the real system. The process of building models disciplines us to build an understanding of the system we are investigating solutions for.
To make sense of an increasingly complex world, we need increasingly sophisticated models. Just as physical tools and machines extend our physical abilities, models extend our mental abilities, enabling us to understand and control systems beyond our immediate comprehension. While it takes time, effort, and collaboration to develop good models, the improvements in efficiency and planning, and the capabilities for experimentation and control that they provide, are transformative. Good models will repay their initial investment many times over, and in ways which may be unexpected ex ante.
Mathematical or computational models can help us understand the complicated (and, at times, counterintuitive) consequences of our decisions. In a changing world, computational models fed by data present new opportunities for improved decision-making in organisations of all sizes, and in all sectors of the economy.
When to Use Models
Outcomes are too high-value to leave decision-making to common sense or experience.
There are too many detailed interactions to keep track of, or outcomes are too interwoven to calculate easily.
It is infeasible to experiment with the target system.
Knowledge from different sources needs to be synthesised to understand the interactions between systems.
Decisions need to be communicated to stakeholders in a way which demonstrates they are well-founded.
One needs to be prepared for a range of future scenarios.
Indeed, in most cases, the alternative to a computational model is an unexamined future.
Modelling should span many scales, and many levels of resolution.
Complex systems are composed of "systems of systems". Consider the human body. At the microscopic level, it is composed of cells and their interactions. These cells are composed into tissues, which in turn make up organs. These organs interact with one another to maintain the organism.
Most models today represent individual components of a system. Often, however, they cannot be combined in such a way that the behaviour of the larger system can be composed from the models of its parts. Models which are designed for modelling systems should be capable of operating at multiple scales.
Relatedly, we should be able to model different levels of detail. The increased availability of large-scale and detailed datasets allows for the development of models which account for low-level differences. This already happens in many commercial organisations which collect vast quantities of data on things like customer spending habits. At other times, simple aggregate behaviours provide enough information to answer high-level questions. The ability to model multiple levels of resolution expands the range of problems a model can help to tackle.
Models should be built & maintained by the people who use them
One of the key barriers to the widespread use of computational models is that the skills required to build them and the know-how needed to understand the system being modelled rarely reside in the same place. In the Venn diagram of computer programmers and subject matter experts, the two circles rarely intersect! When external data scientists or computer programmers are brought into modelling projects, a large communication overhead is introduced and model development becomes slow and costly. Worse still, the models are unlikely to be updated and maintained as the system evolves.
The solution to this problem is that computational models should be built by their users. That is, the people with the business knowledge and know-how should be the ones building models. For this to work, modeling tools need to be significantly easier to use and designed for end-users, not computer programmers.
Models should be a focal point for collaboration
A model represents a single, unambiguous view of the functioning of a system. Models expose key judgements and provide a framework for collaborative input. When models are able to span scales, they provide a framework for system-wide collaboration, as they can be worked on locally but integrated consistently into a model of the system as a whole. As such, models can be an agent for change in organisations seeking to break down silos and promote shared understanding across teams and geographies.
Models are software
Models are software and they should be built and maintained as such. Changes to models should be audited and tracked. There should be a single source of truth (not multiple versions of the same model floating around organisations). Models should be testable, repeatable and verifiable. Models should be constantly maintained and iteratively developed so that they keep pace with changes in the real-world system they represent.
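As a rough illustration of what "testable and repeatable" could mean in practice, here is a minimal sketch in Python. The queue model, its parameters and the function names are all hypothetical, invented purely for this example. The point is only that a model can carry automated checks like any other piece of software.

```python
import random


def simulate_queue_length(arrival_probability, steps, seed):
    """Hypothetical toy model: a queue gains one customer per step with a fixed probability."""
    rng = random.Random(seed)  # seeding makes every run reproducible
    length = 0
    for _ in range(steps):
        if rng.random() < arrival_probability:
            length += 1
    return length


def test_repeatable():
    # The same inputs and seed must give the same result on every run.
    first = simulate_queue_length(0.3, 100, seed=42)
    second = simulate_queue_length(0.3, 100, seed=42)
    assert first == second


def test_known_cases():
    # Boundary cases whose answers are known without running the model.
    assert simulate_queue_length(0.0, 100, seed=1) == 0
    assert simulate_queue_length(1.0, 100, seed=1) == 100


test_repeatable()
test_known_cases()
```

Checks like these can be re-run automatically every time the model changes, which is one way of keeping a single, trusted version of it.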
Models should be visual
Sometimes, one wants to communicate ideas about complex systems, and an illustration or visualisation is a good way of doing this. A well-crafted model can help people to see these complex interactions at work and hence appreciate these complexities better. Visual models are easier to collaborate on. Decisions made with the aid of visual models are easier to communicate.
Models should be interpretable & understandable
There is no commonly agreed upon definition of interpretability in the context of models. One definition I like is that of Miller (2017):
"Interpretability is the degree to which a human can understand the cause of a decision."
The more interpretable a model, the easier it is for someone to understand why certain decisions or predictions have been made. As humans, we have mental models of our environment that are updated when something unexpected happens. This update is performed by finding an explanation for the unexpected event. Of course, we do not need an explanation for everything that happens. It is ok that we don't understand how our computers work. But when something unexpected happens - our computer shuts down unexpectedly - our curiosity is piqued (often accompanied by frustration!) and we search for explanations.
Back in the good old days, statistical models were easy to interpret. Take the linear regression model. In a univariate linear regression, Y = aX + b, we have just one independent variable X whose value changes the value of the dependent variable Y according to some linear relationship, the slope of which is governed by a. The term b is the intercept (often called the bias): the value of Y when X is zero.
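To make the roles of a and b concrete, here is a minimal sketch of how they could be estimated by ordinary least squares. The function name and the data are made up for illustration. The data lie exactly on a known line so the result is easy to check by eye.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single explanatory variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - a * mean_x
    return a, b


# Made-up illustrative data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(f"slope a = {a}, intercept b = {b}")
```

Each of the two fitted numbers has a direct reading: a says how much Y moves per unit of X, and b says where the line sits when X is zero.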
If we were estimating a relationship between unemployment and output in an economy, we might find that the data show that for every 2% increase in output, we see a 1% fall in unemployment. In economics, this observed relationship is known as Okun's Law. In reality, since this is just an empirically observed relationship, it is probably better described as "Okun's Rule of Thumb", rather than as a "Law", but that's just semantics! Estimating this relationship on US data, we find a relationship close to Okun's law:
% Change GDP = 3.2 - 1.8 * % Change Unemployment Rate
The great thing about the interpretability of this estimated relationship is that we can use it to support decision-making. Let's say we are a government trying to reduce the unemployment rate by 0.5 percentage points. The estimated slope associates this with additional economic growth of 1.8 × 0.5 = 0.9%, or roughly 1%, so one way to achieve the reduction is to implement policies which will stimulate that additional growth. This model helps both as a narrative device and as a means of communicating the reasoning behind real-world decisions.
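The arithmetic behind that policy reading can be written out as a short sketch. The coefficients are copied from the estimated relationship quoted above. The function and variable names are my own and purely illustrative.

```python
# Coefficients copied from the estimated relationship quoted in the text.
INTERCEPT = 3.2
SLOPE = -1.8


def gdp_growth(unemployment_change):
    """Predicted % change in GDP for a given change in the unemployment rate."""
    return INTERCEPT + SLOPE * unemployment_change


baseline = gdp_growth(0.0)   # growth consistent with a stable unemployment rate
target = gdp_growth(-0.5)    # growth consistent with a 0.5 point fall
additional_growth = target - baseline
print(f"additional growth needed: {additional_growth:.1f} percentage points")
```

Under these coefficients the additional growth works out at 0.9 percentage points, which is the "roughly 1%" used in the example.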
The advent of much larger datasets, and the increased compute power available to estimate complex relationships hidden in these data, has seen the emergence of so-called "Black Box" statistical models. In our Linear Regression example above, we were estimating the relationship between a single output (% Change GDP) and a single "feature" (% Change Unemployment). In a Neural Network, we look to relate output variables to many features using layers of "Neurons", like in the image of a simple Neural Network below:
Even for a simple model structure like this, it's impossible to understand which neuron is playing what role, and which input feature contributes (and in what way) to the model output. Real-world neural networks are orders of magnitude more complex than the image above - often taking in hundreds of features and using millions of neurons arranged in tens, or hundreds, of hidden layers. The advantage of these models is that they are able to extract much more complex relationships between features. Deep Learning has demonstrated an ability to extract higher-level features from the raw input. In image processing, for example, lower layers may identify edges, while deeper layers may identify the concepts relevant to a human such as digits, letters or faces. But understanding how and why a model makes predictions is an ongoing challenge to the deep-learning community, and one that's incredibly hard to solve.
When using models in a real-world setting, the importance of the social aspect of modelling should not be underestimated. If we intend to use a model as a way to support decisions, our ability to use that model to update other people's mental models is key. There has been an increasing trend in the use of Black Box models for tasks like image recognition and recommender systems. These models are able to show improved predictive performance relative to simpler statistical models, as a result of their ability to model more complex relationships between a larger feature set. These tasks also involve low-risk decisions: failing to correctly recognise a character in an OCR system, or providing a poor film recommendation, is acceptable. Consider instead higher-value decisions. Using a Black Box model to decide whether someone is guilty of a crime, or to perform a medical diagnosis, falls foul of the social requirement that model outputs are explainable, interpretable and communicable.
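One way to see the loss of interpretability is that, unlike in a linear regression, the effect of a feature is no longer a single number. The sketch below uses a tiny network with two inputs, two hidden neurons and one output. The weights are hand-picked and hypothetical, not taken from any trained model, and the function names are my own.

```python
import math

# Hypothetical, hand-picked weights: 2 inputs, 2 hidden neurons, 1 output.
W_HIDDEN = [[1.5, -2.0], [-1.0, 0.5]]  # one row of input weights per hidden neuron
B_HIDDEN = [0.0, 0.5]
W_OUT = [1.0, -1.5]
B_OUT = 0.2


def predict(x):
    """Forward pass: tanh hidden layer followed by a linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    return sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT


def effect_of_first_feature(x, step=0.1):
    """Change in the output when the first feature rises by `step`."""
    bumped = [x[0] + step, x[1]]
    return predict(bumped) - predict(x)


# The same small change in the first feature, evaluated at two different inputs.
effect_a = effect_of_first_feature([0.0, 0.0])
effect_b = effect_of_first_feature([0.0, 2.0])
print(effect_a, effect_b)
```

The two effects differ noticeably, because the contribution of the first feature depends on the value of the second. There is no single slope to read off and explain, even in a network this small.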
The Role of Data
Data are observations that can be used to provide evidence in support of a model. Data can be used in a number of roles, depending on the nature of the data, and the intended use of the model. If the model is being used to provide explanation or to predict future outcomes of an existing system, then we can use data to validate the model. If, on the other hand, we are using a model to design a system which doesn’t exist (or introduce a new decision into a system), then we use data to validate the system against the model. In other words, we check that the system is behaving in the same way as the model. There is a further role for data when we know for certain the structure of the model, but do not know the bounds, or optimal values, of some parameters. In this case, we can use data to fine-tune parameters such as the duration or speed of an event.
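The last of these roles, tuning a parameter when the structure of the model is already known, can be sketched in a few lines. Everything here is an assumption made for illustration: the constant-speed model, the observations and the grid of candidate values are all made up, and a real project would likely use a proper optimiser rather than a simple search.

```python
def simulate_distance(speed, times):
    """Toy model with known structure: an object moving at constant speed."""
    return [speed * t for t in times]


def squared_error(predicted, observed):
    """Sum of squared differences between model output and data."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def calibrate_speed(times, observed, candidates):
    """Return the candidate speed whose simulated output best matches the data."""
    return min(
        candidates,
        key=lambda s: squared_error(simulate_distance(s, times), observed),
    )


# Made-up observations, roughly consistent with a speed of 2.0.
times = [1.0, 2.0, 3.0, 4.0]
observed = [2.1, 3.9, 6.2, 7.8]
candidates = [round(0.1 * k, 1) for k in range(10, 31)]  # 1.0, 1.1, ..., 3.0
best_speed = calibrate_speed(times, observed, candidates)
print(best_speed)
```

The structure of the model never changes here. The data are only used to pick the value of the one unknown parameter.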
Models should be used to support decisions
When the complexity of systems surpasses human ability to comprehend them, models are the only way to understand them. We no longer lift heavy loads by manpower alone, and nor should we expect to lift heavy cognitive loads by brainpower alone. The limits of our brainpower are surprisingly narrow. Miller's Law states that our capacity for holding and processing information is bounded at 7 ± 2 objects in short-term memory. With the aid of computers, we can expand this to many hundreds of objects.
Commerce and much of human life may depend on the properties of the complex systems we have constructed, and models can undoubtedly help us to make difficult decisions in areas that include policymaking, healthcare, industry, finance, retail, and more.
Models will help to train computers
Deep reinforcement learning (DRL) is one of the most exciting fields in AI right now. Bringing intelligence to autonomous systems at scale will require a unique combination of the new practice of machine teaching, advances in deep reinforcement learning and leveraging simulation for training.
When computers learn from real-world data, they suffer from a fundamental limitation: they must see enough examples from the real world in order to build a model that generalises effectively. Unfortunately, we live in a world of fat tails. Catastrophic events are rare. They are low-probability, high-impact. Machines trained without exposure to enough rare events, be they stock market crashes, car crashes, or cyber attacks, won't be robust to these scenarios. No matter how big a statistical hammer we hit real-world data with, we cannot overcome this limitation. Simulators can be used to overcome this deficiency - we can train statistical models on the data they generate.
Computational modelling capabilities are passing beyond a tipping point. The availability of high-performance compute, coupled with the increased availability of high-quality data following the big data revolution, has laid the foundations for organisations to build transformative models.
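A minimal sketch of the idea, under loudly stated assumptions: the crash probabilities, the sample sizes and the function name below are all invented for illustration, and the "real-world" history is itself simulated so that the example is self-contained.

```python
import random


def generate_days(n_days, crash_probability, rng):
    """Label each simulated day as a crash (True) or a normal day (False)."""
    return [rng.random() < crash_probability for _ in range(n_days)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stand-in for real-world history: crashes are rare, so few (possibly none) are observed.
observed = generate_days(1000, crash_probability=0.001, rng=rng)

# Simulator: the modeller chooses to generate stress scenarios far more often.
simulated = generate_days(1000, crash_probability=0.2, rng=rng)

observed_crashes = sum(observed)
simulated_crashes = sum(simulated)
print(observed_crashes, simulated_crashes)
```

A learner trained only on the first dataset has almost no crashes to learn from, while the second gives it many. How realistic the simulated crashes are is, of course, down to the quality of the simulator.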