For all the recent excitement about the potential of Artificial Intelligence (AI) to help public sector organisations, there are growing concerns that it could be used in ways that cause harm, unfairness, and moral wrongs.
These concerns tend to focus on one of three stages of deploying an AI. First, how the AI is created (Creation). Second, how the AI works (Function). And third, what the AI is used to do (Outcome).
In response, dozens of organisations from Google to the European Commission have proposed codes for the responsible use of AI. There’s striking similarity in their recommendations.
At the Creation stage, common calls are to publish and minimise bias in the training data; respect privacy; avoid using data on sensitive factors such as race and religion; and ensure data is handled in compliance with data protection rules.
At the Function stage, recommendations suggest making the code of an AI transparent and open for inspection; minimising bias and limitations in the AI’s assumptions; ensuring the function of the AI can be explained; and protecting it from manipulation.
And at the Outcome stage, suggestions involve ensuring that an AI’s intended and actual outcomes are fair, transparent, legal, and aligned with human values, and that its outcomes can be explained.
These codes are well-intentioned. However, many cease to be meaningful as the complexity of an AI increases.
To explain this, imagine AI at three broad levels of complexity, with Level 1 the simplest and Level 3 the most complex.
A Level 1 AI might use a few structured datasets to weight factors that humans deem important. For example, firefighters could be asked to list the factors relevant to a building’s fire risk. Datasets relating to those factors are then sought to train an AI, and machine learning weights each factor according to how predictive it is of a high-risk building.
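That Level 1 setup can be sketched in a few lines. This is a toy illustration, not a real fire-service model: the factor names, the synthetic data, and the simple logistic-regression fit are all invented for the example.

```python
# Level 1 sketch: humans pick the factors; machine learning only weights them.
# Factor names and data are hypothetical, not a real fire-service dataset.
import numpy as np

rng = np.random.default_rng(0)
factors = ["building_age", "has_sprinklers", "past_incidents"]

# Synthetic data: 500 buildings described only by the human-chosen factors.
X = rng.random((500, 3))
# Synthetic label: risk rises with age and past incidents, falls with sprinklers.
y = (X[:, 0] + X[:, 2] - X[:, 1] > 0.8).astype(float)

# Fit a simple logistic-regression weighting by gradient descent.
w = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ w))       # predicted probability of high risk
    w -= 0.1 * X.T @ (p - y) / len(y)  # gradient step on the log-loss

weights = dict(zip(factors, w))
```

Everything here is inspectable: the factor list, the datasets, and the three learned weights. That transparency is exactly what the higher levels erode.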
At Level 2, an AI can decide for itself the relevance and the weighting of the factors and how they lead to a given outcome. For example, a city agency could use machine learning to analyse thousands of free-text case notes to spot patterns that predict which vulnerable children are most likely to be taken into care in future.
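A minimal sketch of the Level 2 idea, using invented case notes: no human supplies the factors; even a crude word-frequency model surfaces them from the text itself. A real system would use far richer natural-language processing, but the shift in who chooses the factors is the same.

```python
# Level 2 sketch: the model, not a human, decides which features matter.
# The case notes and labels below are invented for illustration.
from collections import Counter

notes = [
    ("missed three school days, parent absent", 1),
    ("home visit fine, attends school daily", 0),
    ("parent absent again, neighbour reported shouting", 1),
    ("stable placement, attends school, no concerns", 0),
]

# Count how often each word appears in high-risk vs low-risk notes.
hi, lo = Counter(), Counter()
for text, label in notes:
    (hi if label else lo).update(text.replace(",", "").split())

# The words the model alone has decided are most predictive of risk.
learned_factors = sorted(hi, key=lambda word: hi[word] - lo[word], reverse=True)[:3]
```

Notice that no one told the model that parental absence matters; it inferred that from the text. Scaled to thousands of notes, the inferred factors may be ones no official ever anticipated, or would endorse.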
At Level 3, the model is continuously updated based on new data, which will often be unstructured and unlimited. Imagine a police surveillance system that constantly analyses surveillance camera footage and sound data from dozens of train stations in order to spot suspicious behaviour.
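The Level 3 pattern can be sketched as online learning: the model shifts with every new observation, so there is never a fixed training set to publish or a fixed model to audit. The two-feature "suspicious behaviour" setup here is entirely hypothetical.

```python
# Level 3 sketch: continuous updating on a (simulated) endless data stream.
# Feature meanings and the stream itself are hypothetical.
import numpy as np

rng = np.random.default_rng(2)
w = np.zeros(2)  # the model: two weights, never final

def observe(x, label, lr=0.05):
    """One online update: the model changes with every new observation."""
    global w
    p = 1 / (1 + np.exp(-x @ w))  # current prediction
    w -= lr * (p - label) * x     # single stochastic gradient step

# Stand-in for a live feed (e.g. movement score, sound level per clip).
for _ in range(5000):
    x = rng.random(2)
    observe(x, float(x[0] > 0.5))  # synthetic 'suspicious' label
```

By the time anyone asked to audit the model received a copy of `w`, the deployed system would already have moved on: there is no stable artefact to check.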
Viewed like this, it’s clear that Level 2 and 3 applications make certain recommendations problematic, if not impossible.
At the Creation stage, being transparent about the training data is feasible at Levels 1 and 2, which use a finite number of datasets. It is much harder at Level 3, where the data is unlimited and constantly changing. At both Levels 2 and 3, analysing for bias is extremely hard when the data is vast and unstructured. (For example, an algorithm could generate a model that has learned an individual’s race from other features, which are then used to predict an outcome.)
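The parenthetical point can be made concrete with a toy proxy check: even after a sensitive column is dropped, a remaining feature may reconstruct it. The "postcode" stand-in below is hypothetical and is built to track the sensitive attribute exactly, so the effect is deliberately stark.

```python
# Toy proxy check on hypothetical data: dropping a sensitive column does not
# remove it if another retained feature encodes the same information.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
sensitive = rng.integers(0, 2, n)  # protected attribute, to be dropped

# A retained feature (a stand-in 'postcode area') that, in this toy setup,
# tracks the sensitive attribute exactly.
postcode = sensitive * 10 + rng.integers(0, 2, n)

# How well can the dropped attribute be reconstructed from what was kept?
guess = (postcode >= 10).astype(int)
proxy_rate = (guess == sensitive).mean()
```

A high `proxy_rate` means the sensitive attribute was never really removed. Real-world proxies are weaker than this, which is precisely why they are so hard to find in vast, unstructured data.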
At the Function stage, making the code of an AI ‘open’ won’t achieve meaningful transparency at Levels 2 and 3. Even if the code is viewable, it will be uncheckable where it is highly complex, where the model changes continuously with live data, or where the use of neural networks means there is no single ‘point of decision making’ to inspect.
Finally, at the Outcome stage, recommendations for explanation and accountability for the outputs of an AI are problematic if no one is able to understand the inner workings of the AI at the Function stage.
How should we respond to these flaws? I propose we ditch the codes and choose the harder but more effective path of educating and then trusting in the professional judgement of public sector staff. After all, the specific contexts in which an AI is used make a huge difference to the potential risks.
To achieve that, staff need to be supported to ask the right questions. I offer the 10 below.
The point is not that there’s one set of ‘right’ answers. Rather, I think it would be unacceptable to deploy an AI in a live environment without first having an answer to these questions.
For those questions that are difficult to handle in more complex instances of AI, public sector staff need to make a judgement call on whether the lack of detail is reason enough not to proceed with using a particular AI.
They might, for example, decide that for a particularly sensitive domain, such as child welfare, they are not willing to use an AI without being able to fully explain its operation, from training data to outcome. My point is that setting that as a hard rule for all contexts (as many codes imply) would prohibit the public sector from benefiting from more advanced, Level 3, forms of AI. I’d rather that call be made by staff at the coalface of an issue than by a distant policymaker.
A vital competence of a public servant is using the right technologies, in the right contexts, in the right way, to further their professionalism and effectiveness.
Let’s choose to empower them to do so by giving them the advice and guidance they need and then trust them to get it right.